Preface


This book is intended as a text for a course in analysis, at the senior or 
first-year graduate level. 

A year-long course in real analysis is an essential part of the preparation 
of any potential mathematician. For the first half of such a course, there 
is substantial agreement as to what the syllabus should be. Standard topics 
include: sequences and series, the topology of metric spaces, and the derivative
and the Riemann integral for functions of a single variable. There are a
number of excellent texts for such a course, including books by Apostol [A],
Rudin [Ru], Goldberg [Go], and Royden [Ro], among others.

There is no such universal agreement as to what the syllabus of the second 
half of such a course should be. Part of the problem is that there are simply 
too many topics that belong in such a course for one to be able to treat them 
all within the confines of a single semester, at more than a superficial level. 

At M.I.T., we have dealt with the problem by offering two independent 
second-term courses in analysis. One of these deals with the derivative and
the Riemann integral for functions of several variables, followed by a
treatment of differential forms and a proof of Stokes’ theorem for manifolds in
euclidean space. The present book has resulted from my years of teaching
this course. The other deals with the Lebesgue integral in euclidean space
and its applications to Fourier analysis.

Prerequisites

As indicated, we assume the reader has completed a one-term course in
analysis that included a study of metric spaces and of functions of a single
variable. We also assume the reader has some background in linear algebra,
including vector spaces and linear transformations, matrix algebra, and
determinants.

The first chapter of the book is devoted to reviewing the basic results from 
linear algebra and analysis that we shall need. Results that are truly basic are
stated without proof, but proofs are provided for those that are sometimes
omitted in a first course. The student may determine from a perusal of this 
chapter whether his or her background is sufficient for the rest of the book. 

How much time the instructor will wish to spend on this chapter will 
depend on the experience and preparation of the students. I usually assign 
Sections 1 and 3 as reading material, and discuss the remainder in class. 

How the book is organized 

The main part of the book falls into two parts. The first, consisting of
Chapters 2 through 4, covers material that is fairly standard: derivatives, the
inverse function theorem, the Riemann integral, and the change of variables
theorem for multiple integrals. The second part of the book is a bit more
sophisticated. It introduces manifolds and differential forms in $R^n$, providing
the framework for proofs of the n-dimensional version of Stokes’ theorem and
of the Poincaré lemma.

A final chapter is devoted to a discussion of abstract manifolds; it is 
intended as a transition to more advanced texts on the subject. 

The dependence among the chapters of the book is expressed in the
following diagram:

[Diagram: dependence among the chapters, beginning with Chapter 1, The Algebra and Topology of $R^n$.]

Certain sections of the book are marked with an asterisk; these sections
may be omitted without loss of continuity. Similarly, certain theorems that 
may be omitted are marked with asterisks. When I use the book in our 
undergraduate analysis sequence, I usually omit Chapter 8, and assign
Chapter 9 as reading. With graduate students, it should be possible to cover the
entire book. 

At the end of each section is a set of exercises. Some are computational in 
nature; students find it illuminating to know that one can compute the volume 
of a five-dimensional ball, even if the practical applications are limited! Other 
exercises are theoretical in nature, requiring that the student analyze carefully 
the theorems and proofs of the preceding section. The more difficult exercises 
are marked with asterisks, but none is unreasonably hard.

The Algebra and Topology of $R^n$


§1. REVIEW OF LINEAR ALGEBRA 


Vector spaces 

Suppose one is given a set V of objects, called vectors. And suppose 
there is given an operation called vector addition, such that the sum of the 
vectors x and y is a vector denoted x + y. Finally, suppose there is given an 
operation called scalar multiplication, such that the product of the scalar 
(i.e., real number) c and the vector x is a vector denoted cx. 

The set V, together with these two operations, is called a vector space 
(or linear space) if the following properties hold for all vectors x, y, z and 
all scalars c, d: 

(1) x + y = y + x. 

(2) x + (y + z) = (x + y) + z. 

(3) There is a unique vector 0 such that x + 0 = x for all x.

(4) x + (-1)x = 0.

(5) 1x = x.

(6) c(dx) = (cd)x. 

(7) (c + d)x = cx + dx. 

(8) c(x + y) = cx + cy. 


One example of a vector space is the set $R^n$ of all n-tuples of real numbers,
with component-wise addition and multiplication by scalars. That is, if
$x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$, then

\[
\begin{aligned}
x + y &= (x_1 + y_1, \ldots, x_n + y_n),\\
cx &= (cx_1, \ldots, cx_n).
\end{aligned}
\]

The vector space properties are easy to check. 

If V is a vector space, then a subset W of V is called a linear subspace 
(or simply, a subspace) of V if for every pair x,y of elements of W and every 
scalar c, the vectors x + y and cx belong to W. In this case, W itself satisfies 
properties (1)-(8) if we use the operations that W inherits from V, so that
W is a vector space in its own right.

In the first part of this book, $R^n$ and its subspaces are the only vector
spaces with which we shall be concerned. In later chapters we shall deal with
more general vector spaces.

Let V be a vector space. A set $a_1, \ldots, a_m$ of vectors in V is said to
span V if to each x in V, there corresponds at least one m-tuple of scalars
$c_1, \ldots, c_m$ such that

\[ x = c_1 a_1 + \cdots + c_m a_m. \]

In this case, we say that x can be written as a linear combination of the
vectors $a_1, \ldots, a_m$.

The set $a_1, \ldots, a_m$ of vectors is said to be independent if to each x in
V there corresponds at most one m-tuple of scalars $c_1, \ldots, c_m$ such that

\[ x = c_1 a_1 + \cdots + c_m a_m. \]

Equivalently, $\{a_1, \ldots, a_m\}$ is independent if to the zero vector 0 there
corresponds only one m-tuple of scalars $d_1, \ldots, d_m$ such that

\[ 0 = d_1 a_1 + \cdots + d_m a_m, \]

namely the scalars $d_1 = d_2 = \cdots = d_m = 0$.

If the set of vectors $a_1, \ldots, a_m$ both spans V and is independent, it is
said to be a basis for V.

One has the following result: 

Theorem 1.1. Suppose V has a basis consisting of m vectors. 
Then any set of vectors that spans V has at least m vectors, and any set 
of vectors of V that is independent has at most m vectors. In particular, 
any basis for V has exactly m vectors. □ 

If V has a basis consisting of m vectors, we say that m is the dimension
of V. We make the convention that the vector space consisting of the zero
vector alone has dimension zero.

It is easy to see that $R^n$ has dimension n. (Surprise!) The following set
of vectors is called the standard basis for $R^n$:

\[
\begin{aligned}
e_1 &= (1, 0, 0, \ldots, 0),\\
e_2 &= (0, 1, 0, \ldots, 0),\\
&\;\;\vdots\\
e_n &= (0, 0, 0, \ldots, 1).
\end{aligned}
\]

The vector space $R^n$ has many other bases, but any basis for $R^n$ must consist
of precisely n vectors.

One can extend the definitions of spanning, independence, and basis to
allow for infinite sets of vectors; then it is possible for a vector space to have
an infinite basis. (See the exercises.) However, we shall not be concerned with
this situation.

Because $R^n$ has a finite basis, so does every subspace of $R^n$. This fact is
a consequence of the following theorem:

Theorem 1.2. Let V be a vector space of dimension m. If W is
a linear subspace of V (different from V), then W has dimension less
than m. Furthermore, any basis $a_1, \ldots, a_k$ for W may be extended to a
basis $a_1, \ldots, a_k, a_{k+1}, \ldots, a_m$ for V. □

Inner products

If V is a vector space, an inner product on V is a function assigning,
to each pair x, y of vectors of V, a real number denoted $\langle x, y \rangle$, such that the
following properties hold for all x, y, z in V and all scalars c:

(1) $\langle x, y \rangle = \langle y, x \rangle$.

(2) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$.

(3) $\langle cx, y \rangle = c\langle x, y \rangle = \langle x, cy \rangle$.

(4) $\langle x, x \rangle > 0$ if $x \neq 0$.

A vector space V together with an inner product on V is called an inner
product space.

A given vector space may have many different inner products. One
particularly useful inner product on $R^n$ is defined as follows: If $x = (x_1, \ldots, x_n)$
and $y = (y_1, \ldots, y_n)$, we define

\[ \langle x, y \rangle = x_1 y_1 + \cdots + x_n y_n. \]

The properties of an inner product are easy to verify. This is the inner
product we shall commonly use in $R^n$. It is sometimes called the dot product;
we denote it by $\langle x, y \rangle$ rather than $x \cdot y$ to avoid confusion with the matrix
product, which we shall define shortly.

If V is an inner product space, one defines the length (or norm) of a
vector of V by the equation

\[ \|x\| = \langle x, x \rangle^{1/2}. \]

The norm function has the following properties:

(1) $\|x\| > 0$ if $x \neq 0$.

(2) $\|cx\| = |c|\,\|x\|$.

(3) $\|x + y\| \leq \|x\| + \|y\|$.

The third of these properties is the only one whose proof requires some work;
it is called the triangle inequality. (See the exercises.) An equivalent form
of this inequality, which we shall frequently find useful, is the inequality

(3') $\|x - y\| \geq \|x\| - \|y\|$.

Any function from V to the reals R that satisfies properties (1)-(3) just
listed is called a norm on V. The length function derived from an inner
product is one example of a norm, but there are other norms that are not
derived from inner products. On $R^n$, for example, one has not only the familiar
norm derived from the dot product, which is called the euclidean norm, but
one has also the sup norm, which is defined by the equation

\[ |x| = \max\{|x_1|, \ldots, |x_n|\}. \]

The sup norm is often more convenient to use than the euclidean norm. We
note that these two norms on $R^n$ satisfy the inequalities

\[ |x| \leq \|x\| \leq \sqrt{n}\,|x|. \]
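The inequalities relating the two norms are easy to test numerically. The following Python sketch is offered only as an illustration; the function names `euclidean_norm` and `sup_norm` are our own and do not come from the text, and floating-point arithmetic stands in for the real numbers.

```python
import math

def euclidean_norm(x):
    """Length derived from the dot product: <x, x>^(1/2)."""
    return math.sqrt(sum(xi * xi for xi in x))

def sup_norm(x):
    """Sup norm: the largest absolute value among the components."""
    return max(abs(xi) for xi in x)

x = [3.0, -4.0, 1.0]   # an arbitrary sample vector in R^3
n = len(x)

# |x| <= ||x|| <= sqrt(n) |x|
assert sup_norm(x) <= euclidean_norm(x) <= math.sqrt(n) * sup_norm(x)
```

A single sample proves nothing, of course; the check merely makes the two definitions concrete.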


Matrices 

A matrix A is a rectangular array of numbers. The general number
appearing in the array is called an entry of A. If the array has n rows and m
columns, we say that A has size n by m, or that A is “an n by m matrix.”
We usually denote the entry of A appearing in the i-th row and j-th column by
$a_{ij}$; we call i the row index and j the column index of this entry.

If A and B are matrices of size n by m, with general entries $a_{ij}$ and $b_{ij}$,
respectively, we define A + B to be the n by m matrix whose general entry
is $a_{ij} + b_{ij}$, and we define cA to be the n by m matrix whose general entry
is $c\,a_{ij}$. With these operations, the set of all n by m matrices is a vector
space; the eight vector space properties are easy to verify. This fact is hardly
surprising, for an n by m matrix is very much like an nm-tuple; the only
difference is that the numbers are written in a rectangular array instead of a
linear array.

The set of matrices has, however, an additional operation, called matrix
multiplication. If A is a matrix of size n by m, and if B is a matrix of size
m by p, then the product $A \cdot B$ is defined to be the matrix C of size n by
p whose general entry $c_{ij}$ is given by the equation

\[ c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}. \]

This product operation satisfies the following properties, which are
straightforward to verify:

(1) $A \cdot (B \cdot C) = (A \cdot B) \cdot C$.

(2) $A \cdot (B + C) = A \cdot B + A \cdot C$.

(3) $(A + B) \cdot C = A \cdot C + B \cdot C$.

(4) $(cA) \cdot B = c(A \cdot B) = A \cdot (cB)$.

(5) For each k, there is a k by k matrix $I_k$ such that if A is any n by m
matrix,

\[ I_n \cdot A = A \quad\text{and}\quad A \cdot I_m = A. \]

In each of these statements, we assume that the matrices involved are of
appropriate sizes, so that the indicated operations may be performed.

The matrix $I_k$ is the matrix of size k by k whose general entry $\delta_{ij}$ is
defined as follows: $\delta_{ij} = 0$ if $i \neq j$, and $\delta_{ij} = 1$ if $i = j$. The matrix $I_k$ is
called the identity matrix of size k by k; it has the form

\[
I_k = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix},
\]

with entries of 1 on the “main diagonal” and entries of 0 elsewhere.

We extend to matrices the sup norm defined for n-tuples. That is, if A
is a matrix of size n by m with general entry $a_{ij}$, we define

\[ |A| = \max\{|a_{ij}|;\ i = 1, \ldots, n \text{ and } j = 1, \ldots, m\}. \]

The three properties of a norm are immediate, as is the following useful result:

Theorem 1.3. If A has size n by m, and B has size m by p, then

\[ |A \cdot B| \leq m|A|\,|B|. \] □
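The definition of the matrix product and the estimate of Theorem 1.3 can be tried out on a small example. The Python sketch below is illustrative only: matrices are represented as lists of rows, and the names `mat_mul` and `sup_norm` are our own choices rather than anything defined in the text.

```python
def mat_mul(A, B):
    """Product of an n-by-m matrix A and an m-by-p matrix B (lists of rows)."""
    m = len(B)
    assert all(len(row) == m for row in A), "inner sizes must agree"
    p = len(B[0])
    # c_ij is the sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(len(A))]

def sup_norm(A):
    """Sup norm of a matrix: the largest absolute value of an entry."""
    return max(abs(entry) for row in A for entry in row)

A = [[1, -2, 3],
     [0, 4, -1]]          # size 2 by 3
B = [[2, 1],
     [-3, 0],
     [1, 5]]              # size 3 by 2
C = mat_mul(A, B)         # size 2 by 2
m = len(B)

# Theorem 1.3: |A . B| <= m |A| |B|
assert sup_norm(C) <= m * sup_norm(A) * sup_norm(B)
```

The factor m appears because each entry of the product is a sum of m terms, each of absolute value at most |A| |B|.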
Linear transformations

If V and W are vector spaces, a function $T : V \to W$ is called a linear
transformation if it satisfies the following properties, for all x, y in V and
all scalars c:

(1) T(x + y) = T(x) + T(y).

(2) T(cx) = cT(x).

If, in addition, T carries V onto W in a one-to-one fashion, then T is called
a linear isomorphism.

One checks readily that if $T : V \to W$ is a linear transformation, and if
$S : W \to X$ is a linear transformation, then the composite $S \circ T : V \to X$ is
a linear transformation. Furthermore, if $T : V \to W$ is a linear isomorphism,
then $T^{-1} : W \to V$ is also a linear isomorphism.

A linear transformation is uniquely determined by its values on basis
elements, and these values may be specified arbitrarily. That is the substance
of the following theorem:

Theorem 1.4. Let V be a vector space with basis $a_1, \ldots, a_m$. Let
W be a vector space. Given any m vectors $b_1, \ldots, b_m$ in W, there is
exactly one linear transformation $T : V \to W$ such that, for all i,
$T(a_i) = b_i$. □

In the special case where V and W are “tuple spaces” such as $R^m$ and
$R^n$, matrix notation gives us a convenient way of specifying a linear
transformation, as we now show.

First we discuss row matrices and column matrices. A matrix of size 1
by n is called a row matrix; the set of all such matrices bears an obvious
resemblance to $R^n$. Indeed, under the one-to-one correspondence

\[ (x_1, \ldots, x_n) \longrightarrow [x_1 \ \cdots \ x_n] \]

the vector space operations also correspond. Thus this correspondence is a
linear isomorphism. Similarly, a matrix of size n by 1 is called a column
matrix; the set of all such matrices also bears an obvious resemblance to $R^n$.
Indeed, the correspondence

\[
(x_1, \ldots, x_n) \longrightarrow
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]

is a linear isomorphism.

The second of these isomorphisms is particularly useful when studying
linear transformations. Suppose for the moment that we represent elements
of $R^m$ and $R^n$ by column matrices rather than by tuples. If A is a fixed n by
m matrix, let us define a function $T : R^m \to R^n$ by the equation

\[ T(x) = A \cdot x. \]

The properties of matrix product imply immediately that T is a linear
transformation.

In fact, every linear transformation of $R^m$ to $R^n$ has this form. The proof
is easy. Given T, let $b_1, \ldots, b_m$ be the vectors of $R^n$ such that $T(e_j) = b_j$.
Then let A be the n by m matrix $A = [b_1 \ \cdots \ b_m]$ with successive columns
$b_1, \ldots, b_m$. Since the identity matrix has columns $e_1, \ldots, e_m$, the equation
$A \cdot I_m = A$ implies that $A \cdot e_j = b_j$ for all j. Then $A \cdot e_j = T(e_j)$ for all j;
it follows from the preceding theorem that $A \cdot x = T(x)$ for all x.

The convenience of this notation leads us to make the following
convention:

Convention. Throughout, we shall represent the elements of $R^n$
by column matrices, unless we specifically state otherwise.
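The construction of the matrix of a linear transformation from its values on the standard basis can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `matrix_of` and `apply` are hypothetical, the sample transformation T is an arbitrary choice of ours, and column matrices are represented as plain lists.

```python
def matrix_of(T, m):
    """Return the matrix whose j-th column is T(e_j), for T defined on m-tuples."""
    columns = []
    for j in range(m):
        e_j = [1 if i == j else 0 for i in range(m)]   # standard basis vector
        columns.append(T(e_j))
    n = len(columns[0])
    # The columns are stored as lists; rearrange them into a list of rows.
    return [[columns[j][i] for j in range(m)] for i in range(n)]

def apply(A, x):
    """Compute A . x, treating the list x as a column matrix."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

# A hypothetical linear transformation from R^3 to R^2, chosen for illustration.
def T(x):
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]

A = matrix_of(T, 3)
x = [5, -1, 2]
assert apply(A, x) == T(x)   # A . x agrees with T(x)
```

Here A comes out as the 2 by 3 matrix with columns T(e_1), T(e_2), T(e_3), in agreement with the argument just given.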

Rank of a matrix 

Given a matrix A of size n by m, there are several important linear spaces
associated with A. One is the space spanned by the columns of A, looked
at as column matrices (equivalently, as elements of $R^n$). This space is called
the column space of A, and its dimension is called the column rank of A.
Because the column space of A is spanned by m vectors, its dimension can
be no larger than m; because it is a subspace of $R^n$, its dimension can be no
larger than n.

Similarly, the space spanned by the rows of A, looked at as row matrices
(or as elements of $R^m$) is called the row space of A, and its dimension is
called the row rank of A.

The following theorem is of fundamental importance: 

Theorem 1.5. For any matrix A, the row rank of A equals the 
column rank of A. □ 

Once one has this theorem, one can speak merely of the rank of a matrix 
A, by which one means the number that equals both the row rank of A and 
the column rank of A. 

The rank of a matrix A is an important number associated with A. One 
cannot in general determine what this number is by inspection. However, 
there is a relatively simple procedure called Gauss-Jordan reduction that
can be used for finding the rank of a matrix. (It is used for other purposes 
as well.) We assume you have seen it before, so we merely review its major 
features here. 
One considers certain operations, called elementary row operations,
that are applied to a matrix A to obtain a new matrix B of the same size.
They are the following:

(1) Exchange rows $i_1$ and $i_2$ of A (where $i_1 \neq i_2$).

(2) Replace row $i_1$ of A by itself plus the scalar c times row $i_2$ (where
$i_1 \neq i_2$).

(3) Multiply row i of A by the non-zero scalar $\lambda$.

Each of these operations is invertible; in fact, the inverse of an elementary
operation is an elementary operation of the same type, as you can check. One
has the following result:

Theorem 1.6. If B is the matrix obtained by applying an elementary
row operation to A, then

\[ \operatorname{rank} B = \operatorname{rank} A. \] □

Gauss-Jordan reduction is the process of applying elementary operations
to A to reduce it to a special form called echelon form (or stairstep form),
for which the rank is obvious. An example of a matrix in this form is the
following:

\[
B = \begin{bmatrix}
\circledast & * & * & * & * & * \\
0 & \circledast & * & * & * & * \\
0 & 0 & 0 & \circledast & * & * \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

Here the entries beneath the “stairsteps” are 0; the entries marked *
may be zero or non-zero, and the “corner entries,” marked $\circledast$, are non-zero.
(The corner entries are sometimes called “pivots.”) One in fact needs only
operations of types (1) and (2) to reduce A to echelon form.

Now it is easy to see that, for a matrix B in echelon form, the non-zero
rows are independent. It follows that they form a basis for the row space of B,
so the rank of B equals the number of its non-zero rows.

For some purposes it is convenient to reduce B to an even more
special form, called reduced echelon form. Using elementary operations of
type (2), one can make all the entries lying directly above each of the corner
entries into 0’s. Then by using operations of type (3), one can make all the
corner entries into 1’s. The reduced echelon form of the matrix B considered
previously has the form:

\[
C = \begin{bmatrix}
1 & 0 & * & 0 & * & * \\
0 & 1 & * & 0 & * & * \\
0 & 0 & 0 & 1 & * & * \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

It is even easier to see that, for the matrix C, its rank equals the number
of its non-zero rows.
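The reduction to echelon form, and the resulting computation of the rank, can be sketched in Python as follows. The sketch is one possible arrangement of the procedure, not a transcription of any algorithm in the text: the names `echelon_form` and `rank` are our own, exact rational arithmetic (`fractions.Fraction`) is assumed so that tests for zero are reliable, and only operations of types (1) and (2) are used.

```python
from fractions import Fraction

def echelon_form(A):
    """Reduce A to echelon form using only row exchanges (type 1) and
    the addition of a multiple of one row to another (type 2)."""
    B = [[Fraction(entry) for entry in row] for row in A]
    n_rows, n_cols = len(B), len(B[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a non-zero entry in this column, at or below pivot_row.
        candidates = [r for r in range(pivot_row, n_rows) if B[r][col] != 0]
        if not candidates:
            continue
        r = candidates[0]
        if r != pivot_row:
            B[pivot_row], B[r] = B[r], B[pivot_row]                  # type (1)
        for r in range(pivot_row + 1, n_rows):
            c = -B[r][col] / B[pivot_row][col]
            B[r] = [x + c * y for x, y in zip(B[r], B[pivot_row])]   # type (2)
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return B

def rank(A):
    """Rank = number of non-zero rows of an echelon form of A."""
    return sum(1 for row in echelon_form(A) if any(x != 0 for x in row))

A = [[1, 2, 3],
     [2, 4, 6],    # twice the first row
     [1, 0, 1]]
assert rank(A) == 2
```

By Theorem 1.6 the row operations do not change the rank, so counting the non-zero rows of the echelon form gives the rank of the original matrix.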

Transpose of a matrix 

Given a matrix A of size n by m, we define the transpose of A to be
the matrix D of size m by n whose general entry in row i and column j is
defined by the equation $d_{ij} = a_{ji}$. The matrix D is often denoted $A^{\mathrm{tr}}$.

The following properties of the transpose operation are readily verified:

(1) $(A^{\mathrm{tr}})^{\mathrm{tr}} = A$.

(2) $(A + B)^{\mathrm{tr}} = A^{\mathrm{tr}} + B^{\mathrm{tr}}$.

(3) $(A \cdot C)^{\mathrm{tr}} = C^{\mathrm{tr}} \cdot A^{\mathrm{tr}}$.

(4) $\operatorname{rank} A^{\mathrm{tr}} = \operatorname{rank} A$.

The first three follow by direct computation, and the last from the fact that
the row rank of $A^{\mathrm{tr}}$ is obviously the same as the column rank of A.
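Properties (1) and (3) can be checked on a sample pair of matrices. The Python sketch below is illustrative only; the names `transpose` and `mat_mul` are ours, and matrices are again lists of rows.

```python
def transpose(A):
    """Transpose: the entry in row i, column j of the result is A[j][i]."""
    return [list(column) for column in zip(*A)]

def mat_mul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 0],
     [3, -1, 4]]          # size 2 by 3
C = [[2, 1],
     [0, 5],
     [-1, 3]]             # size 3 by 2

# Property (1): transposing twice returns the original matrix.
assert transpose(transpose(A)) == A
# Property (3): the transpose of a product reverses the order of the factors.
assert transpose(mat_mul(A, C)) == mat_mul(transpose(C), transpose(A))
```

Note the reversal of order in property (3); the product of the transposes in the original order need not even be defined.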


EXERCISES 


1. Let V be a vector space with inner product $\langle x, y \rangle$ and norm $\|x\| =
\langle x, x \rangle^{1/2}$.

(a) Prove the Cauchy-Schwarz inequality $\langle x, y \rangle \leq \|x\|\,\|y\|$. [Hint:
If $x, y \neq 0$, set $c = 1/\|x\|$ and $d = 1/\|y\|$ and use the fact that
$\|cx \pm dy\| \geq 0$.]

(b) Prove that $\|x + y\| \leq \|x\| + \|y\|$. [Hint: Compute $\langle x + y, x + y \rangle$
and apply (a).]

(c) Prove that $\|x - y\| \geq \|x\| - \|y\|$.

2. If A is an n by m matrix and B is an m by p matrix, show that

\[ |A \cdot B| \leq m|A|\,|B|. \]

3. Show that the sup norm on $R^2$ is not derived from an inner product on $R^2$.
[Hint: Suppose $\langle x, y \rangle$ is an inner product on $R^2$ (not the dot product)
having the property that $|x| = \langle x, x \rangle^{1/2}$. Compute $\langle x \pm y, x \pm y \rangle$ and
apply to the case $x = e_1$ and $y = e_2$.]

4. (a) If $x = (x_1, x_2)$ and $y = (y_1, y_2)$, show that the function

\[
\langle x, y \rangle = [x_1 \ \ x_2]
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
\]

is an inner product on $R^2$.
*(b) Show that the function

\[
\langle x, y \rangle = [x_1 \ \ x_2]
\begin{bmatrix} a & b \\ b & c \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
\]

is an inner product on $R^2$ if and only if $b^2 - ac < 0$ and $a > 0$.

*5. Let V be a vector space; let $\{a_\alpha\}$ be a set of vectors of V, as $\alpha$ ranges over
some index set J (which may be infinite). We say that the set $\{a_\alpha\}$ spans
V if every vector x in V can be written as a finite linear combination

\[ x = c_{\alpha_1} a_{\alpha_1} + \cdots + c_{\alpha_k} a_{\alpha_k} \]

of vectors from this set. The set $\{a_\alpha\}$ is independent if the scalars are
uniquely determined by x. The set $\{a_\alpha\}$ is a basis for V if it both spans
V and is independent.

(a) Check that the set $R^\omega$ of all “infinite-tuples” of real numbers

\[ x = (x_1, x_2, \ldots) \]

is a vector space under component-wise addition and scalar
multiplication.

(b) Let $R^\infty$ denote the subset of $R^\omega$ consisting of all $x = (x_1, x_2, \ldots)$
such that $x_i = 0$ for all but finitely many values of i. Show $R^\infty$ is a
subspace of $R^\omega$; find a basis for $R^\infty$.

(c) Let $\mathcal{F}$ be the set of all real-valued functions $f : [a, b] \to R$. Show that
$\mathcal{F}$ is a vector space if addition and scalar multiplication are defined
in the natural way:

\[
\begin{aligned}
(f + g)(x) &= f(x) + g(x),\\
(cf)(x) &= c f(x).
\end{aligned}
\]

(d) Let $\mathcal{F}_B$ be the subset of $\mathcal{F}$ consisting of all bounded functions. Let
$\mathcal{F}_I$ consist of all integrable functions. Let $\mathcal{F}_C$ consist of all continuous
functions. Let $\mathcal{F}_D$ consist of all continuously differentiable functions.
Let $\mathcal{F}_P$ consist of all polynomial functions. Show that each of these
is a subspace of the preceding one, and find a basis for $\mathcal{F}_P$.

There is a theorem to the effect that every vector space has a
basis. The proof is non-constructive. No one has ever exhibited
specific bases for the vector spaces $R^\omega$, $\mathcal{F}$, $\mathcal{F}_B$, $\mathcal{F}_I$, $\mathcal{F}_C$, $\mathcal{F}_D$.

(e) Show that the integral operator and the differentiation operator,

\[ (If)(x) = \int_a^x f(t)\,dt \quad\text{and}\quad (Df)(x) = f'(x), \]

are linear transformations. What are possible domains and ranges of
these transformations, among those listed in (d)?
§2. MATRIX INVERSION AND DETERMINANTS

We now treat several further aspects of linear algebra. They are the following:
elementary matrices, matrix inversion, and determinants. Proofs are included,
in case some of these results are new to you.

Elementary matrices

Definition. An elementary matrix of size n by n is the matrix
obtained by applying one of the elementary row operations to the identity
matrix $I_n$.

The elementary matrices are of three basic types, depending on which 
of the three operations is used. The elementary matrix corresponding to the 
first elementary operation has the form 

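\[
E = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 0 & \cdots & 1 & & \\
& & \vdots & \ddots & \vdots & & \\
& & 1 & \cdots & 0 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}
\begin{matrix}
\\ \\ \leftarrow \text{row } i_1 \\ \\ \leftarrow \text{row } i_2 \\ \\ \\
\end{matrix}
\]

The elementary matrix corresponding to the second elementary row operation
has the form

\[
E' = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & \cdots & c & & \\
& & & \ddots & \vdots & & \\
& & & & 1 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}
\begin{matrix}
\\ \\ \leftarrow \text{row } i_1 \\ \\ \leftarrow \text{row } i_2 \\ \\ \\
\end{matrix}
\]

And the elementary matrix corresponding to the third elementary row
operation has the form

\[
E'' = \begin{bmatrix}
1 & & & & \\
& \ddots & & & \\
& & \lambda & & \\
& & & \ddots & \\
& & & & 1
\end{bmatrix}
\begin{matrix}
\\ \\ \leftarrow \text{row } i \\ \\ \\
\end{matrix}
\]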

One has the following basic result: 

Theorem 2.1. Let A be an n by m matrix. Any elementary 
row operation on A may be carried out by premultiplying A by the 
corresponding elementary matrix. 

Proof. One proceeds by direct computation. The effect of multiplying A
on the left by the matrix E is to interchange rows $i_1$ and $i_2$ of A. Similarly,
multiplying A by E' has the effect of replacing row $i_1$ by itself plus c times
row $i_2$. And multiplying A by E'' has the effect of multiplying row i by $\lambda$. □


We will use this result later on when we prove the change of variables 
theorem for a multiple integral, as well as in the present section. 
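Theorem 2.1 lends itself to a direct check. The following Python sketch builds the three types of elementary matrices and premultiplies a sample matrix by each; the function names are hypothetical, and rows are numbered from 0 in the code (rather than from 1 as in the text).

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def exchange(n, i1, i2):
    """E: the identity matrix with rows i1 and i2 exchanged."""
    E = identity(n)
    E[i1], E[i2] = E[i2], E[i1]
    return E

def add_multiple(n, i1, i2, c):
    """E': the identity with row i1 replaced by itself plus c times row i2."""
    assert i1 != i2
    E = identity(n)
    E[i1][i2] = c
    return E

def scale(n, i, lam):
    """E'': the identity with row i multiplied by the non-zero scalar lam."""
    assert lam != 0
    E = identity(n)
    E[i][i] = lam
    return E

A = [[1, 2],
     [3, 4],
     [5, 6]]              # size 3 by 2

# Premultiplying by each elementary matrix performs the row operation on A.
assert mat_mul(exchange(3, 0, 2), A) == [[5, 6], [3, 4], [1, 2]]
assert mat_mul(add_multiple(3, 0, 1, 10), A) == [[31, 42], [3, 4], [5, 6]]
assert mat_mul(scale(3, 1, -2), A) == [[1, 2], [-6, -8], [5, 6]]
```

Observe that A need not be square; only the elementary matrix must be, with size equal to the number of rows of A.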

The inverse of a matrix 

Definition. Let A be a matrix of size n by m; let B and C be matrices
of size m by n. We say that B is a left inverse for A if $B \cdot A = I_m$, and we
say that C is a right inverse for A if $A \cdot C = I_n$.

Theorem 2.2. If A has both a left inverse B and a right inverse
C, then they are unique and equal.

Proof. Equality follows from the computation

\[ C = I_m \cdot C = (B \cdot A) \cdot C = B \cdot (A \cdot C) = B \cdot I_n = B. \]

If $B_1$ is another left inverse for A, we apply this same computation with $B_1$
replacing B. We conclude that $C = B_1$; thus $B_1$ and B are equal. Hence B
is unique. A similar computation shows that C is unique. □

Definition. If A has both a right inverse and a left inverse, then A is
said to be invertible. The unique matrix that is both a right inverse and a
left inverse for A is called the inverse of A, and is denoted $A^{-1}$.

A necessary and sufficient condition for A to be invertible is that A be
square and of maximal rank. That is the substance of the following two
theorems:

Theorem 2.3. Let A be a matrix of size n by m. If A is invertible,
then

\[ n = m = \operatorname{rank} A. \]

Proof. Step 1. We show that for any k by n matrix D,

\[ \operatorname{rank}(D \cdot A) \leq \operatorname{rank} A. \]

The proof is easy. If R is a row matrix of size 1 by n, then $R \cdot A$ is a row
matrix that equals a linear combination of the rows of A, so it is an element
of the row space of A. The rows of $D \cdot A$ are obtained by multiplying the
rows of D by A. Therefore each row of $D \cdot A$ is an element of the row space
of A. Thus the row space of $D \cdot A$ is contained in the row space of A and our
inequality follows.

Step 2. We show that if A has a left inverse B, then the rank of A
equals the number of columns of A.

The equation $I_m = B \cdot A$ implies by Step 1 that $m = \operatorname{rank}(B \cdot A) \leq
\operatorname{rank} A$. On the other hand, the row space of A is a subspace of m-tuple
space, so that $\operatorname{rank} A \leq m$.

Step 3. We prove the theorem. Let B be the inverse of A. The fact
that B is a left inverse for A implies by Step 2 that rank A = m. The fact
that B is a right inverse for A implies that

\[ B^{\mathrm{tr}} \cdot A^{\mathrm{tr}} = I_n^{\mathrm{tr}} = I_n, \]

whence by Step 2, $\operatorname{rank} A = \operatorname{rank} A^{\mathrm{tr}} = n$. □


We prove the converse of this theorem in a slightly strengthened version:

Theorem 2.4. Let A be a matrix of size n by m. Suppose

\[ n = m = \operatorname{rank} A. \]

Then A is invertible; and furthermore, A equals a product of elementary
matrices.

Proof. Step 1. We note first that every elementary matrix is
invertible, and that its inverse is an elementary matrix. This follows from the fact
that elementary operations are invertible. Alternatively, you can check
directly that the matrix E corresponding to an operation of the first type is its
own inverse, that an inverse for E' can be obtained by replacing c by $-c$ in
the formula for E', and that an inverse for E'' can be obtained by replacing
$\lambda$ by $1/\lambda$ in the formula for E''.

Step 2. We prove the theorem. Let A be an n by n matrix of rank n.
Let us reduce A to reduced echelon form C by applying elementary row
operations. Because C is square and its rank equals the number of its rows,
C must equal the identity matrix $I_n$. It follows from Theorem 2.1 that there
is a sequence $E_1, \ldots, E_k$ of elementary matrices such that

\[ E_k \cdot (E_{k-1} \cdots (E_2 \cdot (E_1 \cdot A)) \cdots) = I_n. \]

If we multiply both sides of this equation on the left by $E_k^{-1}$, then by $E_{k-1}^{-1}$,
and so on, we obtain the equation

\[ A = E_1^{-1} \cdot E_2^{-1} \cdots E_k^{-1}; \]

thus A equals a product of elementary matrices. Direct computation shows
that the matrix

\[ B = E_k \cdot E_{k-1} \cdots E_1 \]

is both a right and a left inverse for A. □


One very useful consequence of this theorem is the following: 

Theorem 2.5. If A is a square matrix and if B is a left inverse 
for A, then B is also a right inverse for A. 

Proof. Since A has a left inverse, Step 2 of the proof of Theorem 2.3 
implies that the rank of A equals the number of columns of A. Since A is 
square, this is the same as the number of rows of A, so the preceding theorem 
implies that A has an inverse. By Theorem 2.2, this inverse must be B. □ 

An n by n matrix A is said to be singular if rank A < n; otherwise,
it is said to be non-singular. The theorems just proved imply that A is 
invertible if and only if A is non-singular. 
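The proof of Theorem 2.4 is constructive: applying to $I_n$ the same row operations that reduce A to the identity produces the matrix $B = E_k \cdots E_1$, which is the inverse of A. A minimal Python sketch of this idea follows. It is an illustration under our own assumptions: the name `inverse` is hypothetical, exact rational arithmetic is used, and the three types of operations are interleaved column by column rather than applied in the order described in the text.

```python
from fractions import Fraction

def inverse(A):
    """Invert an n-by-n matrix of rank n by reducing it to the identity,
    applying every row operation to a copy of the identity matrix as well."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    B = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Type (1): bring a non-zero entry into the corner position.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular (rank < n)")
        M[col], M[pivot] = M[pivot], M[col]
        B[col], B[pivot] = B[pivot], B[col]
        # Type (3): make the corner entry equal to 1.
        lam = 1 / M[col][col]
        M[col] = [lam * x for x in M[col]]
        B[col] = [lam * x for x in B[col]]
        # Type (2): clear the other entries of this column.
        for r in range(n):
            if r != col and M[r][col] != 0:
                c = -M[r][col]
                M[r] = [x + c * y for x, y in zip(M[r], M[col])]
                B[r] = [x + c * y for x, y in zip(B[r], B[col])]
    return B   # M is now the identity, and B is the product of the E's

A = [[2, 1],
     [5, 3]]
assert inverse(A) == [[3, -1], [-5, 2]]
```

If no non-zero entry can be found for some column, the matrix has rank less than n, so it is singular and the sketch raises an error.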

Determinants 

The determinant is a function that assigns, to each square matrix A, a
number called the determinant of A and denoted det A.

The notation $|A|$ is often used for the determinant of A, but we are using
this notation to denote the sup norm of A. So we shall use “det A” to denote
the determinant instead.

In this section, we state three axioms for the determinant function, and
we assume the existence of a function satisfying these axioms. The actual
construction of the general determinant function will be postponed to a later
chapter.

Definition. A function that assigns, to each n by n matrix A, a real
number denoted det A, is called a determinant function if it satisfies the
following axioms:

(1) If B is the matrix obtained by exchanging any two rows of A, then
$\det B = -\det A$.

(2) Given i, the function det A is linear as a function of the i-th row alone.

(3) $\det I_n = 1$.

Condition (2) can be formulated as follows: Let i be fixed. Given an
n-tuple x, let $A_i(x)$ denote the matrix obtained from A by replacing the i-th
row by x. Then condition (2) states that

\[ \det A_i(ax + by) = a \det A_i(x) + b \det A_i(y). \]

These three axioms characterize the determinant function uniquely, as we
shall see.


EXAMPLE 1. In low dimensions, it is easy to construct the determinant
function. For 1 by 1 matrices, the function

\[ \det [a] = a \]

will do. For 2 by 2 matrices, the function

\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \]

suffices. And for 3 by 3 matrices, the function

\[
\det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
= a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - a_3 b_2 c_1 - a_1 b_3 c_2 - a_2 b_1 c_3
\]

will do, as you can readily check. For matrices of larger size, the definition
is more complicated. For example, the expression for the determinant of a 4
by 4 matrix involves 24 terms; and for a 5 by 5 matrix, it involves 120 terms!
Obviously, a less direct approach is needed. We shall return to this matter in
Chapter 6.
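The formulas of Example 1 translate directly into code. The Python sketch below, with the hypothetical names `det2` and `det3`, spot-checks two of the axioms on sample matrices and confirms the term counts quoted above, on the assumption (a standard fact, not proved here) that the general formula for an n by n matrix has n! terms.

```python
import math

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

def det3(M):
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = M
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a1 * b3 * c2 - a2 * b1 * c3)

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert det3(I3) == 1                                    # Axiom (3)
assert det3([[0, 1, 0], [1, 0, 0], [0, 0, 1]]) == -1    # Axiom (1): two rows exchanged
assert det2([[1, 2], [3, 4]]) == -2

# Number of terms in the general formula, for n = 2, 3, 4, 5.
assert [math.factorial(n) for n in (2, 3, 4, 5)] == [2, 6, 24, 120]
```

These checks verify the axioms only on particular matrices; the general verification is the computation left to the reader in the example.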


Using the axioms, one can determine how the elementary row operations 
affect the value of the determinant. One has the following result: 
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Theorem 2.6. Let A be an n by n matrix. 

(a) If E is the elementary matrix corresponding to the operation 
that exchanges rows i\ and i%, then det(E • A) = — det A. 

(b) If E' is the elementary matrix corresponding to the operation 
that replaces row i\ of A by itself plus c times row i 2 , then det^ 7 • A) = 
det A. 

(c) If E" is the elementary matrix corresponding to the operation 
that multiplies row i of A by the non- zero scalar A, then det (E ,r • A) = 
A(det A). 

(d) If A is the identity matrix I n , then det A = 1. 

Proof. Property (a) is a restatement of Axiom 1, and (d) is a restate- 
ment of Axiom 3. Property (c) follows directly from linearity (Axiom 2); it 
states merely that 

det Ai(Xx) = A(detAj(x)). 

Now we verify (b). Note first that if A has two equal rows, then det A = 0. 
For exchanging these rows does not change the matrix A, but by Axiom 1 it 
changes the sign of the determinant. Now let E' be the elementary operation
that replaces row $i = i_1$ by itself plus c times row $i_2$. Let x equal row $i_1$ and
let y equal row $i_2$. We compute

\begin{align*}
\det(E' \cdot A) &= \det A_i(x + cy) \\
 &= \det A_i(x) + c \det A_i(y) \\
 &= \det A_i(x), \quad \text{since } A_i(y) \text{ has two equal rows,} \\
 &= \det A, \quad \text{since } A_i(x) = A. \qquad \square
\end{align*}


The four properties of the determinant function stated in this theorem are 
what one usually uses in practice rather than the axioms themselves. They 
also characterize the determinant completely, as we shall see. 

One can use these properties to compute the determinants of the elementary
matrices. Setting $A = I_n$ in Theorem 2.6, we have

\[ \det E = -1 \quad\text{and}\quad \det E' = 1 \quad\text{and}\quad \det E'' = \lambda. \]

We shall see later how they can be used to compute the determinant in general. 

Now we derive the further properties of the determinant function that we 
shall need. 

Theorem 2.7. Let A be a square matrix. If the rows of A are
independent, then $\det A \neq 0$; if the rows are dependent, then $\det A = 0$.
Thus an n by n matrix A has rank n if and only if $\det A \neq 0$.

Proof. First, we note that if the i-th row of A is the zero row, then
det A = 0. For multiplying row i by 2 leaves A unchanged; on the other
hand, it must multiply the value of the determinant by 2.

Second, we note that applying one of the elementary row operations to A
does not affect the vanishing or non-vanishing of the determinant, for it alters
the value of the determinant by a factor of either $-1$ or $1$ or $\lambda$ (where $\lambda \neq 0$).

Now by means of elementary row operations, let us reduce A to a matrix B 
in echelon form. (Elementary operations of types (1) and (2) will suffice.) If 
the rows of A are dependent, rank A < n; then rank B < n, so that B must 
have a zero row. Then det B = 0, as just noted; it follows that det A = 0. 

If the rows of A are independent, let us reduce B further to reduced echelon
form C. Since C is square and has rank n, C must equal the identity
matrix $I_n$. Then $\det C \neq 0$; it follows that $\det A \neq 0$. □

The proof just given can be refined so as to provide a method for calcu- 
lating the determinant function: 

Theorem 2.8. Given a square matrix A, let us reduce it to
echelon form B by elementary row operations of types (1) and (2). If
B has a zero row, then $\det A = 0$. Otherwise, let k be the number of row
exchanges involved in the reduction process. Then det A equals $(-1)^k$
times the product of the diagonal entries of B.

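Proof. If B has a zero row, then rank A < n and det A = 0. So
suppose that B has no zero row. We know from (a) and (b) of Theorem 2.6
that $\det A = (-1)^k \det B$. Furthermore, B must have the form

\[ B = \begin{bmatrix} b_{11} & * & \cdots & * \\ 0 & b_{22} & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & b_{nn} \end{bmatrix}, \]

where the diagonal entries are non-zero. It remains to show that

\[ \det B = b_{11} b_{22} \cdots b_{nn}. \]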

For that purpose, let us apply elementary operations of type (2) to make
the entries above the diagonal into zeros. The diagonal entries are unaffected
by the process; therefore the resulting matrix has the form

\[ C = \begin{bmatrix} b_{11} & 0 & \cdots & 0 \\ 0 & b_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & b_{nn} \end{bmatrix}. \]

Since only operations of type (2) are involved, we have $\det B = \det C$.
Now let us multiply row 1 of C by $1/b_{11}$, row 2 by $1/b_{22}$, and so on, obtaining
as our end result the identity matrix $I_n$. Property (c) of Theorem 2.6 implies
that

\[ \det I_n = (1/b_{11})(1/b_{22}) \cdots (1/b_{nn}) \det C, \]

so that (using property (d))

\[ \det C = b_{11} b_{22} \cdots b_{nn}, \]


as desired. □ 
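
The procedure of Theorem 2.8 can be sketched in code as follows. This is only an illustration under stated assumptions: the function name `det_by_reduction` is made up here, exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding, and a column with no usable pivot is treated as the signal that rank A < n (so the echelon form would have a zero row).

```python
from fractions import Fraction


def det_by_reduction(rows):
    """Determinant via row reduction, using only row exchanges (type 1)
    and adding a multiple of one row to another (type 2)."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    exchanges = 0  # the number k of row exchanges in Theorem 2.8
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot here, so rank A < n and det A = 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]  # type (1): changes the sign
            exchanges += 1
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # type (2): row r minus factor times the pivot row; determinant unchanged
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    product = Fraction(1)
    for i in range(n):
        product *= a[i][i]  # product of the diagonal entries of the echelon form
    return (-1) ** exchanges * product


print(det_by_reduction([[0, 1], [1, 0]]))
print(det_by_reduction([[2, 0, 1], [1, 3, -1], [0, 5, 4]]))
```

The number of arithmetic operations grows roughly like n cubed, in contrast with the n! terms of the explicit formula mentioned in Example 1.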

Corollary 2.9. The determinant function is uniquely characterized
by its three axioms. It is also characterized by the four properties
listed in Theorem 2.6.

Proof. The calculation of det A just given uses only properties (a)-(d)
of Theorem 2.6. These in turn follow from the three axioms. □

Theorem 2.10. Let A and B be n by n matrices. Then 
det(A • B ) = (det A) • (det B). 


Proof. Step 1 . The theorem holds when A is an elementary matrix. 
Indeed: 

\begin{align*}
\det(E \cdot B) &= -\det B = (\det E)(\det B), \\
\det(E' \cdot B) &= \det B = (\det E')(\det B), \\
\det(E'' \cdot B) &= \lambda \cdot \det B = (\det E'')(\det B).
\end{align*}

Step 2. The theorem holds when rank A = n. For in that case, A is
a product of elementary matrices, and one merely applies Step 1 repeatedly.
Specifically, if $A = E_1 \cdots E_k$, then

\begin{align*}
\det(A \cdot B) &= \det(E_1 \cdots E_k \cdot B) \\
 &= (\det E_1) \det(E_2 \cdots E_k \cdot B) \\
 &\;\;\vdots \\
 &= (\det E_1)(\det E_2) \cdots (\det E_k)(\det B).
\end{align*}

This equation holds for all B. In the case $B = I_n$, it tells us that

\[ \det A = (\det E_1)(\det E_2) \cdots (\det E_k). \]


The theorem follows.

Step 3. We complete the proof by showing that the theorem holds if
rank A < n. We have in general,

\[ \operatorname{rank}(A \cdot B) = \operatorname{rank}(A \cdot B)^{\mathrm{tr}} = \operatorname{rank}(B^{\mathrm{tr}} \cdot A^{\mathrm{tr}}) \le \operatorname{rank} A^{\mathrm{tr}}, \]

where the inequality follows from Step 1 of Theorem 2.3. Thus if rank A < n, 
the theorem holds because both sides of the equation vanish. □ 

Even in low dimensions, this theorem would be very unpleasant to prove 
by direct computation. You might try it in the 2 by 2 case! 
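
For readers who want to see the 2 by 2 case written out, here is one way the direct computation can go. The entry names p, q, r, s for B are chosen here for illustration and do not appear in the text.

```latex
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad
B = \begin{bmatrix} p & q \\ r & s \end{bmatrix}, \quad
A \cdot B = \begin{bmatrix} ap + br & aq + bs \\ cp + dr & cq + ds \end{bmatrix}
\]
\begin{align*}
\det(A \cdot B) &= (ap + br)(cq + ds) - (aq + bs)(cp + dr) \\
 &= acpq + adps + bcqr + bdrs - acpq - adqr - bcps - bdrs \\
 &= ad(ps - qr) - bc(ps - qr) \\
 &= (ad - bc)(ps - qr) = (\det A)(\det B).
\end{align*}
```

Already for 3 by 3 matrices the same approach would involve expanding products of six-term expressions, which is why the argument through elementary matrices is preferable.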

Theorem 2.11. $\det A^{\mathrm{tr}} = \det A$.

Proof. Step 1. We show the theorem holds when A is an elementary 
matrix. 

Let E, E', and E'' be elementary matrices of the three basic types. Direct
inspection shows that $E^{\mathrm{tr}} = E$ and $(E'')^{\mathrm{tr}} = E''$, so the theorem is trivial
in these cases. For the matrix E' of type (2), we note that its transpose is
another elementary matrix of type (2), so that both have determinant 1.

Step 2. We verify the theorem when A has rank n. In that case, A is a 
product of elementary matrices, say 

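\[ A = E_1 \cdot E_2 \cdots E_k. \]

Then

\begin{align*}
\det A^{\mathrm{tr}} &= \det(E_k^{\mathrm{tr}} \cdots E_2^{\mathrm{tr}} \cdot E_1^{\mathrm{tr}}) \\
 &= (\det E_k^{\mathrm{tr}}) \cdots (\det E_2^{\mathrm{tr}})(\det E_1^{\mathrm{tr}}) \quad \text{by Theorem 2.10,} \\
 &= (\det E_k) \cdots (\det E_2)(\det E_1) \quad \text{by Step 1,} \\
 &= (\det E_1)(\det E_2) \cdots (\det E_k) \\
 &= \det(E_1 \cdot E_2 \cdots E_k) \\
 &= \det A.
\end{align*}

Step 3. The theorem holds if rank A < n. In this case, $\operatorname{rank} A^{\mathrm{tr}} < n$,
so that $\det A^{\mathrm{tr}} = 0 = \det A$. □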

A formula for $A^{-1}$

We know that A is invertible if and only if $\det A \neq 0$. Now we derive a
formula for $A^{-1}$ that involves determinants explicitly.

Definition. Let A be an n by n matrix. The matrix of size n - 1 by
n - 1 that is obtained from A by deleting the i-th row and the j-th column of
A is called the (i,j)-minor of A. It is denoted $A_{ij}$. The number

\[ (-1)^{i+j} \det A_{ij} \]

is called the (i,j)-cofactor of A.

Lemma 2.12. Let A be an n by n matrix; let b denote its entry
in row i and column j.

(a) If all the entries in row i other than b vanish, then

\[ \det A = b(-1)^{i+j} \det A_{ij}. \]

(b) The same equation holds if all the entries in column j other than
the entry b vanish.

Proof. Step 1. We verify a special case of the theorem. Let $b, a_2, \ldots, a_n$
be fixed numbers. Given an n - 1 by n - 1 matrix D, let A(D) denote the n
by n matrix

\[ A(D) = \begin{bmatrix} b & a_2 & \cdots & a_n \\ 0 & & & \\ \vdots & & D & \\ 0 & & & \end{bmatrix}. \]

We show that $\det A(D) = b(\det D)$.

If b = 0, this result is obvious, since in that case rank A(D) < n. So
assume $b \neq 0$. Define a function f by the equation

\[ f(D) = (1/b) \det A(D). \]

We show that f satisfies the four properties stated in Theorem 2.6, so that
f(D) = det D.

Exchanging two rows of D has the effect of exchanging two rows of A(D),
which changes the value of f by a factor $-1$. Replacing row $i_1$ of D by itself
plus c times row $i_2$ of D has the effect of replacing row $(i_1 + 1)$ of A(D)
by itself plus c times row $(i_2 + 1)$ of A(D), which leaves the value of f unchanged.
Multiplying row i of D by $\lambda$ has the effect of multiplying row (i + 1) of A(D)
by $\lambda$, which changes the value of f by a factor of $\lambda$. Finally, if $D = I_{n-1}$,
then A(D) is in echelon form, so $\det A(D) = b \cdot 1 \cdots 1$ by Theorem 2.8, and

\[ f(D) = 1. \]

Step 2. It follows by taking transposes that

\[ \det \begin{bmatrix} b & 0 & \cdots & 0 \\ a_2 & & & \\ \vdots & & D & \\ a_n & & & \end{bmatrix} = b(\det D). \]


Step 3. We prove the theorem. Let A be a matrix satisfying the hypotheses
of our theorem. One can by a sequence of i - 1 exchanges of adjacent
rows bring the i-th row of A up to the top of the matrix, without affecting the
order of the remaining rows. Then by a sequence of j - 1 exchanges of adjacent
columns, one can bring the j-th column of this matrix to the left edge of
the matrix, without affecting the order of the remaining columns. The matrix C
that results has the form of one of the matrices considered in Steps 1
and 2. Furthermore, the (1,1)-minor $C_{11}$ of the matrix C is identical with
the (i,j)-minor $A_{ij}$ of the original matrix A.

Now each row exchange changes the sign of the determinant. So does 
each column exchange, by Theorem 2.11. Therefore 


\[ \det C = (-1)^{(i-1)+(j-1)} \det A = (-1)^{i+j} \det A. \]

Thus

\begin{align*}
\det A &= (-1)^{i+j} \det C \\
 &= (-1)^{i+j}\, b \det C_{11} \quad \text{by Steps 1 and 2,} \\
 &= (-1)^{i+j}\, b \det A_{ij}. \qquad \square
\end{align*}

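Theorem 2.13 (Cramer's rule). Let A be an n by n matrix with
successive columns $a_1, \ldots, a_n$. Let

\[ x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \quad\text{and}\quad c = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \]

be column matrices. If $A \cdot x = c$, then

\[ (\det A) \cdot x_i = \det [a_1 \cdots a_{i-1} \;\, c \;\, a_{i+1} \cdots a_n]. \]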


Proof. Let $e_1, \ldots, e_n$ be the standard basis for $R^n$, where each $e_i$ is
written as a column matrix. Let C be the matrix

\[ C = [e_1 \cdots e_{i-1} \;\, x \;\, e_{i+1} \cdots e_n]. \]

The equations $A \cdot e_j = a_j$ and $A \cdot x = c$ imply that

\[ A \cdot C = [a_1 \cdots a_{i-1} \;\, c \;\, a_{i+1} \cdots a_n]. \]

By Theorem 2.10,

\[ (\det A) \cdot (\det C) = \det [a_1 \cdots a_{i-1} \;\, c \;\, a_{i+1} \cdots a_n]. \]

Now C has the form

\[ C = \begin{bmatrix} 1 & \cdots & x_1 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & x_i & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & x_n & \cdots & 1 \end{bmatrix}, \]

where the entry $x_i$ appears in row i and column i. Hence by the preceding
lemma,

\[ \det C = x_i (-1)^{i+i} \det I_{n-1} = x_i. \]


The theorem follows. □ 


Here now is the formula we have been seeking: 


Theorem 2.14. Let A be an n by n matrix of rank n; let $B = A^{-1}$.
Then

\[ b_{ij} = \frac{(-1)^{j+i} \det A_{ji}}{\det A}. \]


Proof. Let j be fixed throughout this argument. Let

\[ x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \]

denote the j-th column of the matrix B. The fact that $A \cdot B = I_n$ implies in
particular that $A \cdot x = e_j$. Cramer's rule tells us that

\[ (\det A) \cdot x_i = \det [a_1 \cdots a_{i-1} \;\, e_j \;\, a_{i+1} \cdots a_n]. \]

We conclude from Lemma 2.12 that

\[ (\det A) \cdot x_i = 1 \cdot (-1)^{j+i} \det A_{ji}. \]

Since $x_i = b_{ij}$, our theorem follows. □


This theorem gives us an algorithm for computing the inverse of a ma- 
trix A. One proceeds as follows: 

(1) First, form the matrix whose entry in row i and column j is
$(-1)^{i+j} \det A_{ij}$; this matrix is called the matrix of cofactors of A.

(2) Second, take the transpose of this matrix.

(3) Third, divide each entry of this matrix by det A.

This algorithm is in fact not very useful for practical purposes; computing
determinants is simply too time-consuming. The importance of this formula
for $A^{-1}$ is theoretical, as we shall see. If one wishes actually to compute $A^{-1}$,
there is an algorithm based on Gauss-Jordan reduction that is more efficient.
It is outlined in the exercises.
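
The three steps of the cofactor algorithm can be sketched as follows, purely as an illustration for small matrices. The names `minor`, `det`, and `inverse_by_cofactors` are hypothetical; the helper `det` evaluates determinants by expansion along the first row (the rule stated as Theorem 2.15), and exact fractions are assumed so that the result can be checked without rounding.

```python
from fractions import Fraction


def minor(a, i, j):
    """Delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row; the empty matrix gets 1."""
    if not a:
        return Fraction(1)
    return sum(((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a))),
               Fraction(0))


def inverse_by_cofactors(rows):
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("det A = 0, so A is not invertible")
    # Step (1): the matrix of cofactors.
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)] for i in range(n)]
    # Steps (2) and (3): transpose, then divide each entry by det A.
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]


for row in inverse_by_cofactors([[1, 2], [3, 4]]):
    print(row)
```

Each of the n squared cofactors requires its own determinant, which is the source of the inefficiency noted above.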


Expansion by cofactors 

We now derive a final formula for evaluating the determinant. This is the 
one place we actually need the axioms for the determinant function rather 
than the properties stated in Theorem 2.6. 


Theorem 2.15. Let A be an n by n matrix. Let i be fixed. Then

\[ \det A = \sum_{k=1}^{n} a_{ik} (-1)^{i+k} \det A_{ik}. \]

Here $A_{ik}$ is, as usual, the (i,k)-minor of A. This rule is called the "rule
for expansion of the determinant by cofactors of the i-th row." There is a
similar rule for expansion by cofactors of the j-th column, proved by taking
transposes.


Proof. Let $A_i(x)$, as usual, denote the matrix obtained from A by replacing
the i-th row by the n-tuple x. If $e_1, \ldots, e_n$ denote the usual basis
vectors in $R^n$ (written as row matrices in this case), then the i-th row of A
can be written in the form

\[ \sum_{k=1}^{n} a_{ik} e_k. \]

Then

\begin{align*}
\det A &= \sum_{k=1}^{n} a_{ik} \cdot \det A_i(e_k) \quad \text{by linearity (Axiom 2),} \\
 &= \sum_{k=1}^{n} a_{ik} (-1)^{i+k} \det A_{ik} \quad \text{by Lemma 2.12.} \qquad \square
\end{align*}
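
Expansion by cofactors translates directly into a recursive procedure. The sketch below is illustrative only: `det_expand` is a made-up name, indices are 0-based (which leaves the sign $(-1)^{i+k}$ unchanged, since both indices shift by one), and the minors are themselves expanded along their first row.

```python
def minor(a, i, k):
    """Delete row i and column k (0-based indices)."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(a) if r != i]


def det_expand(a, i=0):
    """Expand the determinant of the square matrix a by cofactors of row i."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (i + k) * a[i][k] * det_expand(minor(a, i, k))
               for k in range(n))


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # arbitrary sample matrix
print([det_expand(A, i) for i in range(3)])  # the theorem says all rows agree
```

The recursion visits on the order of n! terms, so it is a way to see the formula at work rather than a practical algorithm for large n.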

EXERCISES

1. Consider the matrix

\[ A = \begin{bmatrix} 1 & 2 \\ 1 & -1 \\ 0 & 1 \end{bmatrix}. \]

(a) Find two different left inverses for A. 

(b) Show that A has no right inverse. 

2. Let A be an n by m matrix with $n \neq m$.

(a) If rank A = m, show there exists a matrix D that is a product of
elementary matrices such that

\[ D \cdot A = \begin{bmatrix} I_m \\ 0 \end{bmatrix}. \]


(b) Show that A has a left inverse if and only if rank A = m. 

(c) Show that A has a right inverse if and only if rank A = n. 

3 . Verify that the functions defined in Example 1 satisfy the axioms for the 
determinant function. 

4. (a) Let A be an n by n matrix of rank n. By applying elementary row
operations to A, one can reduce A to the identity matrix. Show
that by applying the same operations, in the same order, to $I_n$, one
obtains the matrix $A^{-1}$.


(b) Let 


A - 0 1 


Calculate $A^{-1}$ by using the algorithm suggested in (a). [Hint: An
easy way to do this is to reduce the 3 by 6 matrix $[A \;\, I_3]$ to reduced
echelon form.]

(c) Calculate $A^{-1}$ using the formula involving determinants.

5. Let

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \]

where $ad - bc \neq 0$. Find $A^{-1}$.

*6. Prove the following:

Theorem. Let A be a k by k matrix, let D have size n by n and
let C have size n by k. Then

\[ \det \begin{bmatrix} A & 0 \\ C & D \end{bmatrix} = (\det A) \cdot (\det D). \]

Proof. First show that

\[ \begin{bmatrix} A & 0 \\ 0 & I_n \end{bmatrix} \cdot \begin{bmatrix} I_k & 0 \\ C & D \end{bmatrix} = \begin{bmatrix} A & 0 \\ C & D \end{bmatrix}. \]

Then use Lemma 2.12.


§3. REVIEW OF TOPOLOGY IN $R^n$


Metric spaces 

Recall that if A and B are sets, then $A \times B$ denotes the set of all ordered
pairs (a, b) for which $a \in A$ and $b \in B$.

Given a set X, a metric on X is a function $d : X \times X \to R$ such that
the following properties hold for all $x, y, z \in X$:

(1) $d(x, y) = d(y, x)$.

(2) $d(x, y) \ge 0$, and equality holds if and only if x = y.

(3) $d(x, z) \le d(x, y) + d(y, z)$.

A metric space is a set X together with a specific metric on X . We often 
suppress mention of the metric, and speak simply of “the metric space X.” 

If X is a metric space with metric d , and if Y is a subset of X, then the 
restriction of d to the set Y x Y is a metric on Y ; thus Y is a metric space 
in its own right. It is called a subspace of X. 

For example, $R^n$ has the metrics

\[ d(x, y) = \|x - y\| \quad\text{and}\quad d(x, y) = |x - y|; \]

they are called the euclidean metric and the sup metric, respectively. It 
follows immediately from the properties of a norm that they are metrics. For 
many purposes, these two metrics on R n are equivalent, as we shall see. 

We shall in this book be concerned only with the metric space R n and 
its subspaces, except for the expository final section, in which we deal with 
general metric spaces. The space R n is commonly called n-dimensional 
euclidean space. 

If X is a metric space with metric d, then given $x_0 \in X$ and given $\epsilon > 0$,
the set

\[ U(x_0; \epsilon) = \{ x \mid d(x, x_0) < \epsilon \} \]

is called the $\epsilon$-neighborhood of $x_0$, or the $\epsilon$-neighborhood centered at
$x_0$. A subset U of X is said to be open in X if for each $x_0 \in U$ there is a
corresponding $\epsilon > 0$ such that $U(x_0; \epsilon)$ is contained in U. A subset C of X
is said to be closed in X if its complement X - C is open in X. It follows
from the triangle inequality that an $\epsilon$-neighborhood is itself an open set.

If U is any open set containing $x_0$, we commonly refer to U simply as a
neighborhood of $x_0$.

Theorem 3.1. Let (X, d) be a metric space. Then finite intersec- 
tions and arbitrary unions of open sets of X are open in X. Similarly , 
finite unions and arbitrary intersections of closed sets of X are closed 
in X. □ 

Theorem 3.2. Let X be a metric space; let Y be a subspace. A
subset A of Y is open in Y if and only if it has the form

\[ A = U \cap Y, \]

where U is open in X. Similarly, a subset A of Y is closed in Y if and
only if it has the form

\[ A = C \cap Y, \]

where C is closed in X. □

It follows that if A is open in Y and Y is open in X , then A is open in 
X. Similarly, if A is closed in Y and Y is closed in X , then A is closed in X . 

If X is a metric space, a point $x_0$ of X is said to be a limit point
of the subset A of X if every $\epsilon$-neighborhood of $x_0$ intersects A in at least
one point different from $x_0$. An equivalent condition is to require that every
neighborhood of $x_0$ contain infinitely many points of A.

Theorem 3.3. If A is a subset of X, then the set $\bar{A}$ consisting
of A and all its limit points is a closed set of X. A subset of X is closed
if and only if it contains all its limit points. □

The set $\bar{A}$ is called the closure of A.

In $R^n$, the $\epsilon$-neighborhoods in our two standard metrics are given special
names. If $a \in R^n$, the $\epsilon$-neighborhood of a in the euclidean metric is called the
open ball of radius $\epsilon$ centered at a, and denoted $B(a; \epsilon)$. The $\epsilon$-neighborhood
of a in the sup metric is called the open cube of radius $\epsilon$ centered at a, and
denoted $C(a; \epsilon)$. The inequalities $|x| \le \|x\| \le \sqrt{n}\,|x|$ lead to the following
inclusions:

\[ B(a; \epsilon) \subset C(a; \epsilon) \subset B(a; \sqrt{n}\,\epsilon). \]

These inclusions in turn imply the following: 

Theorem 3.4. If X is a subspace of $R^n$, the collection of open
sets of X is the same whether one uses the euclidean metric or the sup
metric on X. The same is true for the collection of closed sets of X. □
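
The norm inequalities behind the inclusions $B(a; \epsilon) \subset C(a; \epsilon) \subset B(a; \sqrt{n}\,\epsilon)$ can be spot-checked numerically. The sketch below is a random test, not a proof; the function names, the sampling range, and the small floating-point tolerance are all assumptions made for the illustration.

```python
import math
import random


def sup_norm(x):
    """|x| = max |x_i|, the norm giving the sup metric."""
    return max(abs(t) for t in x)


def euclid_norm(x):
    """||x|| = sqrt(sum x_i^2), the norm giving the euclidean metric."""
    return math.sqrt(sum(t * t for t in x))


def check_norm_inequalities(n, trials=1000, seed=0):
    """Test |x| <= ||x|| <= sqrt(n) |x| on random points of R^n."""
    rng = random.Random(seed)
    tol = 1e-12  # allowance for floating-point rounding
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        s, e = sup_norm(x), euclid_norm(x)
        if not (s <= e + tol and e <= math.sqrt(n) * s + tol):
            return False
    return True


print(check_norm_inequalities(3))
```

The first inequality says a euclidean ball sits inside the cube of the same radius; the second says the cube sits inside a ball whose radius is larger by the factor $\sqrt{n}$.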

In general, any property of a metric space X that depends only on the 
collection of open sets of X , rather than on the specific metric involved, is 
called a topological property of X. Limits, continuity, and compactness 
are examples of such, as we shall see. 

Limits and Continuity 

Let X and Y be metric spaces, with metrics $d_X$ and $d_Y$, respectively.

We say that a function $f : X \to Y$ is continuous at the point $x_0$ of X
if for each open set V of Y containing $f(x_0)$, there is an open set U of X
containing $x_0$ such that $f(U) \subset V$. We say f is continuous if it is continuous
at each point $x_0$ of X. Continuity of f is equivalent to the requirement that
for each open set V of Y, the set

\[ f^{-1}(V) = \{ x \mid f(x) \in V \} \]

is open in X, or alternatively, the requirement that for each closed set D
of Y, the set $f^{-1}(D)$ is closed in X.

Continuity may be formulated in a way that involves the metrics specifically.
The function f is continuous at $x_0$ if and only if the following holds:
For each $\epsilon > 0$, there is a corresponding $\delta > 0$ such that

\[ d_Y(f(x), f(x_0)) < \epsilon \quad\text{whenever}\quad d_X(x, x_0) < \delta. \]

This is the classical "$\epsilon$-$\delta$ formulation of continuity."

Note that given $x_0 \in X$ it may happen that for some $\delta > 0$, the
$\delta$-neighborhood of $x_0$ consists of the point $x_0$ alone. In that case, $x_0$ is called an
isolated point of X, and any function $f : X \to Y$ is automatically continuous
at $x_0$!

A constant function from X to Y is continuous, and so is the identity
function $i_X : X \to X$. So are restrictions and composites of continuous
functions:

Theorem 3.5. (a) Let $x_0 \in A$, where A is a subspace of X. If
$f : X \to Y$ is continuous at $x_0$, then the restricted function $f|A : A \to Y$
is continuous at $x_0$.

(b) Let $f : X \to Y$ and $g : Y \to Z$. If f is continuous at $x_0$ and g is
continuous at $y_0 = f(x_0)$, then $g \circ f : X \to Z$ is continuous at $x_0$. □


Theorem 3.6. (a) Let X be a metric space. Let $f : X \to R^n$ have
the form

\[ f(x) = (f_1(x), \ldots, f_n(x)). \]

Then f is continuous at $x_0$ if and only if each function $f_i : X \to R$ is
continuous at $x_0$. The functions $f_i$ are called the component functions
of f.

(b) Let $f, g : X \to R$ be continuous at $x_0$. Then f + g and f - g and
f · g are continuous at $x_0$; and f/g is continuous at $x_0$ if $g(x_0) \neq 0$.

(c) The projection function $\pi_i : R^n \to R$ given by $\pi_i(x) = x_i$ is
continuous. □

These theorems imply that functions formed from the familiar real-valued
continuous functions of calculus, using algebraic operations and composites,
are continuous in $R^n$. For instance, since one knows that the functions $e^x$ and
$\sin x$ are continuous in R, it follows that such a function as

\[ f(s, t, u, v) = (\sin(s + t)) / e^{uv} \]

is continuous in $R^4$.

Now we define the notion of limit. Let X be a metric space. Let $A \subset X$
and let $f : A \to Y$. Let $x_0$ be a limit point of the domain A of f. (The point
$x_0$ may or may not belong to A.) We say that f(x) approaches $y_0$ as x
approaches $x_0$ if for each open set V of Y containing $y_0$, there is an open
set U of X containing $x_0$ such that $f(x) \in V$ whenever x is in $U \cap A$ and
$x \neq x_0$. This statement is expressed symbolically in the form

\[ f(x) \to y_0 \quad\text{as}\quad x \to x_0. \]

We also say in this situation that the limit of f(x), as x approaches $x_0$, is
$y_0$. This statement is expressed symbolically by the equation

\[ \lim_{x \to x_0} f(x) = y_0. \]

Note that the requirement that $x_0$ be a limit point of A guarantees that
there exist points x different from $x_0$ belonging to the set $U \cap A$. We do not
attempt to define the limit of f if $x_0$ is not a limit point of the domain
of f.

Note also that the value of f at $x_0$ (provided f is even defined at $x_0$) is
not involved in the definition of the limit.

The notion of limit can be formulated in a way that involves the metrics
specifically. One shows readily that f(x) approaches $y_0$ as x approaches $x_0$
if and only if the following condition holds: For each $\epsilon > 0$, there is a
corresponding $\delta > 0$ such that

\[ d_Y(f(x), y_0) < \epsilon \quad\text{whenever}\quad x \in A \text{ and } 0 < d_X(x, x_0) < \delta. \]

There is a direct relation between limits and continuity; it is the following: 

Theorem 3.7. Let $f : X \to Y$. If $x_0$ is an isolated point of X,
then f is continuous at $x_0$. Otherwise, f is continuous at $x_0$ if and only
if $f(x) \to f(x_0)$ as $x \to x_0$. □

Most of the theorems dealing with continuity have counterparts that deal 
with limits: 

Theorem 3.8. (a) Let $A \subset X$; let $f : A \to R^n$ have the form

\[ f(x) = (f_1(x), \ldots, f_n(x)). \]

Let $a = (a_1, \ldots, a_n)$. Then $f(x) \to a$ as $x \to x_0$ if and only if $f_i(x) \to a_i$
as $x \to x_0$, for each i.

(b) Let $f, g : A \to R$. If $f(x) \to a$ and $g(x) \to b$ as $x \to x_0$, then as
$x \to x_0$,

\begin{align*}
f(x) + g(x) &\to a + b, \\
f(x) - g(x) &\to a - b, \\
f(x) \cdot g(x) &\to a \cdot b;
\end{align*}

also, $f(x)/g(x) \to a/b$ if $b \neq 0$. □

Interior and Exterior 

The following concepts make sense in an arbitrary metric space. Since we 
shall use them only for R n , we define them only in that case. 

Definition. Let A be a subset of $R^n$. The interior of A, as a subset of
$R^n$, is defined to be the union of all open sets of $R^n$ that are contained in A;
it is denoted Int A. The exterior of A is defined to be the union of all open
sets of $R^n$ that are disjoint from A; it is denoted Ext A. The boundary of A
consists of those points of $R^n$ that belong neither to Int A nor to Ext A; it is
denoted Bd A.

A point x is in Bd A if and only if every open set containing x intersects
both A and the complement $R^n - A$ of A. The space $R^n$ is the union of the
disjoint sets Int A, Ext A, and Bd A; the first two are open in $R^n$ and the
third is closed in $R^n$.

For example, suppose Q is the rectangle

\[ Q = [a_1, b_1] \times \cdots \times [a_n, b_n], \]

consisting of all points x of $R^n$ such that $a_i \le x_i \le b_i$ for all i. You can check
that

\[ \operatorname{Int} Q = (a_1, b_1) \times \cdots \times (a_n, b_n). \]

We often call Int Q an open rectangle. Furthermore, $\operatorname{Ext} Q = R^n - Q$ and
$\operatorname{Bd} Q = Q - \operatorname{Int} Q$.

An open cube is a special case of an open rectangle; indeed,

\[ C(a; \epsilon) = (a_1 - \epsilon, a_1 + \epsilon) \times \cdots \times (a_n - \epsilon, a_n + \epsilon). \]

The corresponding (closed) rectangle

\[ C = [a_1 - \epsilon, a_1 + \epsilon] \times \cdots \times [a_n - \epsilon, a_n + \epsilon] \]

is often called a closed cube, or simply a cube, centered at a.

EXERCISES 

Throughout, let X be a metric space with metric d.

1. Show that $U(x_0; \epsilon)$ is an open set.

2. Let $Y \subset X$. Give an example where A is open in Y but not open in X.
Give an example where A is closed in Y but not closed in X.

3. Let $A \subset X$. Show that if C is a closed set of X and C contains A, then
C contains $\bar{A}$.

4. (a) Show that if Q is a rectangle, then Q equals the closure of Int Q.

(b) If D is a closed set, what is the relation in general between the set D
and the closure of Int D?

(c) If U is an open set, what is the relation in general between the set U
and the interior of $\bar{U}$?

5. Let $f : X \to Y$. Show that f is continuous if and only if for each $x \in X$
there is a neighborhood U of x such that $f|U$ is continuous.

6. Let $X = A \cup B$, where A and B are subspaces of X. Let $f : X \to Y$;
suppose that the restricted functions

\[ f|A : A \to Y \quad\text{and}\quad f|B : B \to Y \]

are continuous. Show that if both A and B are closed in X, then f is
continuous.

7. Finding the limit of a composite function $g \circ f$ is easy if both f and g are
continuous; see Theorem 3.5. Otherwise, it can be a bit tricky:

Let $f : X \to Y$ and $g : Y \to Z$. Let $x_0$ be a limit point of X and let
$y_0$ be a limit point of Y. See Figure 3.1. Consider the following three
conditions:

(i) $f(x) \to y_0$ as $x \to x_0$.

(ii) $g(y) \to z_0$ as $y \to y_0$.

(iii) $g(f(x)) \to z_0$ as $x \to x_0$.

(a) Give an example where (i) and (ii) hold, but (iii) does not.

(b) Show that if (i) and (ii) hold and if $g(y_0) = z_0$, then (iii) holds.

8. Let $f : R \to R$ be defined by setting $f(x) = \sin x$ if x is rational, and
f(x) = 0 otherwise. At what points is f continuous?

9. If we denote the general point of $R^2$ by (x, y), determine Int A, Ext A, and
Bd A for the subset A of $R^2$ specified by each of the following conditions:

(a) $x = 0$.

(b) $0 < x < 1$.

(c) $0 < x < 1$ and $0 < y < 1$.

(d) x is rational and $y > 0$.

(e) x and y are rational.

(f) $0 < x^2 + y^2 < 1$.

(g) $y < x^2$.

(h) $y \le x^2$.



Figure 3.1 

An important class of subspaces of $R^n$ is the class of compact spaces. We shall
use the basic properties of such spaces constantly. The properties we shall 
need are summarized in the theorems of this section. Proofs are included, 
since some of these results you may not have seen before. 

A second useful class of spaces is the class of connected spaces; we sum- 
marize here those few properties we shall need. 

We do not attempt to deal here with compactness and connectedness in 
arbitrary metric spaces, but comment that many of our proofs do hold in that 
more general situation. 

Compact spaces 

Definition. Let X be a subspace of $R^n$. A covering of X is a collection
of subsets of $R^n$ whose union contains X; if each of the subsets is open in $R^n$,
it is called an open covering of X. The space X is said to be compact if
every open covering of X contains a finite subcollection that also forms an
open covering of X.

While this definition of compactness involves open sets of R n , it can be 
reformulated in a manner that involves only open sets of the space X: 

Theorem 4.1. A subspace X of $R^n$ is compact if and only if for
every collection of sets open in X whose union is X, there is a finite
subcollection whose union equals X.

Proof. Suppose X is compact. Let $\{A_\alpha\}$ be a collection of sets open
in X whose union is X. Choose, for each $\alpha$, an open set $U_\alpha$ of $R^n$ such
that $A_\alpha = U_\alpha \cap X$. Since X is compact, some finite subcollection of the
collection $\{U_\alpha\}$ covers X, say for $\alpha = \alpha_1, \ldots, \alpha_k$. Then the sets $A_\alpha$, for
$\alpha = \alpha_1, \ldots, \alpha_k$, have X as their union.

The proof of the converse is similar. □ 

The following result is always proved in a first course in analysis, so the 
proof will be omitted here: 

Theorem 4.2. The subspace [a, b] of R is compact. □

Definition. A subspace X of $R^n$ is said to be bounded if there is an
M such that $|x| \le M$ for all $x \in X$.

We shall eventually show that a subspace of R n is compact if and only if 
it is closed and bounded. Half of that theorem is easy; we prove it now: 

Theorem 4.3. If X is a compact subspace of $R^n$, then X is closed
and bounded.

Proof. Step 1. We show that X is bounded. For each positive integer N,
let $U_N$ denote the open cube $U_N = C(0; N)$. Then $U_N$ is an open
set; and $U_1 \subset U_2 \subset \cdots$; and the sets $U_N$ cover all of $R^n$ (so in particular they
cover X). Some finite subcollection also covers X, say for $N = N_1, \ldots, N_k$.
If M is the largest of the numbers $N_1, \ldots, N_k$, then X is contained in $U_M$;
thus X is bounded.

Step 2. We show that X is closed by showing that the complement of
X is open. Let a be a point of $R^n$ not in X; we find an $\epsilon$-neighborhood of a
that lies in the complement of X.

For each positive integer N, consider the cube

\[ C_N = \{ x : |x - a| \le 1/N \}. \]

Then $C_1 \supset C_2 \supset \cdots$, and the intersection of the sets $C_N$ consists of the
point a alone. Let $V_N$ be the complement of $C_N$; then $V_N$ is an open set; and
$V_1 \subset V_2 \subset \cdots$; and the sets $V_N$ cover all of $R^n$ except for the point a (so they
cover X). Some finite subcollection covers X, say for $N = N_1, \ldots, N_k$. If M
is the largest of the numbers $N_1, \ldots, N_k$, then X is contained in $V_M$. Then
the set $C_M$ is disjoint from X, so that in particular the open cube $C(a; 1/M)$
lies in the complement of X. See Figure 4.1. □


Figure 4.1

Corollary 4.4. Let X be a compact subspace of R. Then X has a 
largest element and a smallest element. 

Proof. Since X is bounded, it has a greatest lower bound and a least
upper bound. Since X is closed, these elements must belong to X . □ 

Here is a basic (and familiar) result that is used constantly: 

Theorem 4.5 (Extreme-value theorem). Let X be a compact
subspace of $R^m$. If $f : X \to R^n$ is continuous, then f(X) is a compact
subspace of $R^n$.

In particular, if $\phi : X \to R$ is continuous, then $\phi$ has a maximum
value and a minimum value.

Proof. Let $\{V_\alpha\}$ be a collection of open sets of $R^n$ that covers f(X).
The sets $f^{-1}(V_\alpha)$ form an open covering of X. Hence some finitely many of
them cover X, say for $\alpha = \alpha_1, \ldots, \alpha_k$. Then the sets $V_\alpha$ for $\alpha = \alpha_1, \ldots, \alpha_k$
cover f(X). Thus f(X) is compact.

Now if $\phi : X \to R$ is continuous, $\phi(X)$ is compact, so it has a largest
element and a smallest element. These are the maximum and minimum values
of $\phi$. □

Now we prove a result that may not be so familiar. 

Definition. Let X be a subset of $R^n$. Given $\epsilon > 0$, the union of the
sets $B(a; \epsilon)$, as a ranges over all points of X, is called the $\epsilon$-neighborhood
of X in the euclidean metric. Similarly, the union of the sets $C(a; \epsilon)$ is called
the $\epsilon$-neighborhood of X in the sup metric.

Theorem 4.6 (The $\epsilon$-neighborhood theorem). Let X be a compact
subspace of $R^n$; let U be an open set of $R^n$ containing X. Then
there is an $\epsilon > 0$ such that the $\epsilon$-neighborhood of X (in either metric)
is contained in U.

Proof. The $\epsilon$-neighborhood of X in the euclidean metric is contained in
the $\epsilon$-neighborhood of X in the sup metric. Therefore it suffices to deal only
with the latter case.

Step 1. Let C be a fixed subset of $R^n$. For each $x \in R^n$, we define

\[ d(x, C) = \inf \{ |x - c| : c \in C \}. \]

We call d(x, C) the distance from x to C. We show it is continuous as a
function of x:

Let $c \in C$; let $x, y \in R^n$. The triangle inequality implies that

\[ d(x, C) - |x - y| \le |x - c| - |x - y| \le |y - c|. \]

This inequality holds for all $c \in C$; therefore

\[ d(x, C) - |x - y| \le d(y, C), \]

so that

\[ d(x, C) - d(y, C) \le |x - y|. \]

The same inequality holds if x and y are interchanged; continuity of d(x, C)
follows.

Step 2. We prove the theorem. Given U, define $f : X \to R$ by the
equation

\[ f(x) = d(x, R^n - U). \]

Then f is a continuous function. Furthermore, f(x) > 0 for all $x \in X$. For if
$x \in X$, then some $\delta$-neighborhood of x is contained in U, whence $f(x) \ge \delta$.
Because X is compact, f has a minimum value $\epsilon$. Because f takes on only
positive values, this minimum value is positive. Then the $\epsilon$-neighborhood
of X is contained in U. □
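
The inequality $|d(x, C) - d(y, C)| \le |x - y|$ from Step 1 can be illustrated for a finite set C, where the infimum is simply a minimum. The sketch below makes that simplifying assumption; the function names and the sample points are hypothetical.

```python
def sup_dist(x, y):
    """Sup-metric distance |x - y| between two points of R^n."""
    return max(abs(s - t) for s, t in zip(x, y))


def dist_to_set(x, C):
    """d(x, C) for a finite, non-empty set C: the infimum is attained as a minimum."""
    return min(sup_dist(x, c) for c in C)


C = [(0.0, 0.0), (2.0, 0.0)]   # sample finite set (illustrative data)
x, y = (1.0, 0.5), (3.0, 3.0)  # two sample points

gap = abs(dist_to_set(x, C) - dist_to_set(y, C))
print(gap, sup_dist(x, y))  # the first number should not exceed the second
```

For an infinite set C the same bound holds, as the proof shows, but the distance can no longer be computed by a finite minimum.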

This theorem does not hold without some hypothesis on the set $X$. If $X$
is the $x$-axis in $\mathbf{R}^2$, for example, and $U$ is the open set

$$U = \{\, (x, y) \mid y^2 < 1/(1 + x^2) \,\},$$

then there is no $\epsilon$ such that the $\epsilon$-neighborhood of $X$ is contained in $U$. See
Figure 4.2.
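To see concretely why no $\epsilon$ works in this example, one can exhibit, for each $\epsilon$, a point of the $\epsilon$-neighborhood of the $x$-axis that lies outside $U$. The sketch below does this; the helper names `in_U` and `escape_point` are hypothetical, and the choice $x = 4/\epsilon$ is just one convenient option.

```python
def in_U(x, y):
    """Membership test for U = {(x, y) : y^2 < 1/(1 + x^2)}."""
    return y * y < 1.0 / (1.0 + x * x)

def escape_point(eps):
    """Return a point within distance eps of the x-axis that is not in U.

    The point (x, eps/2) is at distance eps/2 < eps from (x, 0). Choosing
    x = 4/eps gives 1/(1 + x^2) < eps^2/16 < eps^2/4, so the point is
    outside U.
    """
    x = 4.0 / eps
    return (x, eps / 2.0)
```

For instance, `escape_point(0.1)` returns a point with first coordinate 40 and second coordinate 0.05; it is within 0.1 of the axis but fails the membership test for $U$.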


Here is another familiar result. 

Theorem 4.7 (Uniform continuity). Let $X$ be a compact subspace
of $\mathbf{R}^m$; let $f : X \to \mathbf{R}^n$ be continuous. Given $\epsilon > 0$, there is a $\delta > 0$
such that whenever $\mathbf{x}, \mathbf{y} \in X$,

$$|\mathbf{x} - \mathbf{y}| < \delta \quad \text{implies} \quad |f(\mathbf{x}) - f(\mathbf{y})| < \epsilon.$$

This result also holds if one uses the euclidean metric instead of the
sup metric.

The condition stated in the conclusion of the theorem is called the
condition of uniform continuity.

Proof. Consider the subspace $X \times X$ of $\mathbf{R}^m \times \mathbf{R}^m$; and within this,
consider the space

$$\Delta = \{\, (\mathbf{x}, \mathbf{x}) \mid \mathbf{x} \in X \,\},$$

which is called the diagonal of $X \times X$. The diagonal is a compact subspace
of $\mathbf{R}^{2m}$, since it is the image of the compact space $X$ under the continuous
map $\mathbf{x} \mapsto (\mathbf{x}, \mathbf{x})$.

We prove the theorem first for the euclidean metric. Consider the function
$g : X \times X \to \mathbf{R}$ defined by the equation

$$g(\mathbf{x}, \mathbf{y}) = \| f(\mathbf{x}) - f(\mathbf{y}) \|.$$

Then consider the set of points $(\mathbf{x}, \mathbf{y})$ of $X \times X$ for which $g(\mathbf{x}, \mathbf{y}) < \epsilon$.
Because $g$ is continuous, this set is an open set of $X \times X$. Also, it contains
the diagonal $\Delta$, since $g(\mathbf{x}, \mathbf{x}) = 0$. Therefore, it equals the intersection with
$X \times X$ of an open set $U$ of $\mathbf{R}^m \times \mathbf{R}^m$ that contains $\Delta$. See Figure 4.3.



Figure 4.3


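Compactness of $\Delta$ implies that for some $\delta$, the $\delta$-neighborhood of $\Delta$ is
contained in $U$. This is the $\delta$ required by our theorem. For if $\mathbf{x}, \mathbf{y} \in X$ with
$\| \mathbf{x} - \mathbf{y} \| < \delta$, then

$$\| (\mathbf{x}, \mathbf{y}) - (\mathbf{y}, \mathbf{y}) \| = \| (\mathbf{x} - \mathbf{y}, \mathbf{0}) \| = \| \mathbf{x} - \mathbf{y} \| < \delta,$$

so that $(\mathbf{x}, \mathbf{y})$ belongs to the $\delta$-neighborhood of the diagonal $\Delta$. Then $(\mathbf{x}, \mathbf{y})$
belongs to $U$, so that $g(\mathbf{x}, \mathbf{y}) < \epsilon$, as desired.

The corresponding result for the sup metric can be derived by a similar
proof, or simply by noting that if $|\mathbf{x} - \mathbf{y}| < \delta/\sqrt{m}$, then $\| \mathbf{x} - \mathbf{y} \| < \delta$, whence

$$|f(\mathbf{x}) - f(\mathbf{y})| \le \| f(\mathbf{x}) - f(\mathbf{y}) \| < \epsilon. \quad \square$$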

To complete our characterization of the compact subspaces of $\mathbf{R}^n$, we need
the following lemma:

Lemma 4.8. The rectangle

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n]$$

in $\mathbf{R}^n$ is compact.

Proof. We proceed by induction on $n$. The lemma is true for $n = 1$; we
suppose it true for $n - 1$ and prove it true for $n$. We can write

$$Q = X \times [a_n, b_n],$$

where $X$ is a rectangle in $\mathbf{R}^{n-1}$. Then $X$ is compact by the induction
hypothesis. Let $\mathcal{A}$ be an open covering of $Q$.

Step 1. We show that given $t \in [a_n, b_n]$, there is an $\epsilon > 0$ such that the
set

$$X \times (t - \epsilon, t + \epsilon)$$

can be covered by finitely many elements of $\mathcal{A}$.

The set $X \times t$ is a compact subspace of $\mathbf{R}^n$, for it is the image of $X$ under
the continuous map $f : X \to \mathbf{R}^n$ given by $f(\mathbf{x}) = (\mathbf{x}, t)$. Therefore it may be
covered by finitely many elements of $\mathcal{A}$, say by $A_1, \ldots, A_k$.

Let $U$ be the union of these sets; then $U$ is open and contains $X \times t$. See
Figure 4.4.

Because $X \times t$ is compact, there is an $\epsilon > 0$ such that the $\epsilon$-neighborhood
of $X \times t$ is contained in $U$. Then in particular, the set $X \times (t - \epsilon, t + \epsilon)$ is
contained in $U$, and hence is covered by $A_1, \ldots, A_k$.

Step 2. By the result of Step 1, we may for each $t \in [a_n, b_n]$ choose an
open interval $V_t$ about $t$, such that the set $X \times V_t$ can be covered by finitely
many elements of the collection $\mathcal{A}$.

Now the open intervals $V_t$ in $\mathbf{R}$ cover the interval $[a_n, b_n]$; hence finitely
many of them cover this interval, say for $t = t_1, \ldots, t_m$.

Then $Q = X \times [a_n, b_n]$ is contained in the union of the sets $X \times V_t$
for $t = t_1, \ldots, t_m$; since each of these sets can be covered by finitely many
elements of $\mathcal{A}$, so may $Q$ be covered. □

Theorem 4.9. If $X$ is a closed and bounded subspace of $\mathbf{R}^n$, then
$X$ is compact.

Proof. Let $\mathcal{A}$ be a collection of open sets that covers $X$. Let us adjoin
to this collection the single set $\mathbf{R}^n - X$, which is open in $\mathbf{R}^n$ because $X$ is
closed. Then we have an open covering of all of $\mathbf{R}^n$. Because $X$ is bounded,
we can choose a rectangle $Q$ that contains $X$; our collection then in particular
covers $Q$.

Since $Q$ is compact, some finite subcollection covers $Q$. If this finite
subcollection contains the set $\mathbf{R}^n - X$, we discard it from the collection. We
then have a finite subcollection of the collection $\mathcal{A}$; it may not cover all of $Q$,
but it certainly covers $X$, since the set $\mathbf{R}^n - X$ we discarded contains no point
of $X$. □

All the theorems of this section hold if $\mathbf{R}^n$ and $\mathbf{R}^m$ are replaced by
arbitrary metric spaces, except for the theorem just proved. That theorem
does not hold in an arbitrary metric space; see the exercises.

Connected spaces 

If X is a metric space, then X is said to be connected if X cannot be 
written as the union of two disjoint non-empty sets A and B , each of which 
is open in X . 

The following theorem is always proved in a first course in analysis, so 
the proof will be omitted here: 

Theorem 4.10. The closed interval $[a, b]$ of $\mathbf{R}$ is connected. □

The basic fact about connected spaces that we shall use is the following: 

Theorem 4.11 (Intermediate-value theorem). Let $X$ be connected.
If $f : X \to Y$ is continuous, then $f(X)$ is a connected subspace
of $Y$.

In particular, if $\phi : X \to \mathbf{R}$ is continuous and if $\phi(\mathbf{x}_0) < r < \phi(\mathbf{x}_1)$
for some points $\mathbf{x}_0, \mathbf{x}_1$ of $X$, then $\phi(\mathbf{x}) = r$ for some point $\mathbf{x}$ of $X$.

Proof. Suppose $f(X) = A \cup B$, where $A$ and $B$ are disjoint non-empty
sets open in $f(X)$. Then $f^{-1}(A)$ and $f^{-1}(B)$ are disjoint non-empty sets whose
union is $X$, and each is open in $X$ because $f$ is continuous. This contradicts
connectedness of $X$.

Given $\phi$, let $A$ consist of all $y$ in $\mathbf{R}$ with $y < r$, and let $B$ consist of all $y$
with $y > r$. Then $A$ and $B$ are open in $\mathbf{R}$; if the set $\phi(X)$ does not contain $r$,
then $\phi(X)$ is the union of the disjoint sets $\phi(X) \cap A$ and $\phi(X) \cap B$, each of
which is open in $\phi(X)$ and non-empty (they contain $\phi(\mathbf{x}_0)$ and $\phi(\mathbf{x}_1)$,
respectively). This contradicts connectedness of $\phi(X)$. □

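If $\mathbf{a}$ and $\mathbf{b}$ are points of $\mathbf{R}^n$, then the line segment joining $\mathbf{a}$ and $\mathbf{b}$ is
defined to be the set of all points $\mathbf{x}$ of the form $\mathbf{x} = \mathbf{a} + t(\mathbf{b} - \mathbf{a})$, where
$0 \le t \le 1$. Any line segment is connected, for it is the image of the interval
$[0, 1]$ under the continuous map $t \mapsto \mathbf{a} + t(\mathbf{b} - \mathbf{a})$.

A subset $A$ of $\mathbf{R}^n$ is said to be convex if for every pair $\mathbf{a}, \mathbf{b}$ of points of
$A$, the line segment joining $\mathbf{a}$ and $\mathbf{b}$ is contained in $A$. Any convex subset $A$
of $\mathbf{R}^n$ is automatically connected: For if $A$ is the union of the disjoint
non-empty sets $U$ and $V$, each of which is open in $A$, we need merely choose $\mathbf{a}$
in $U$ and $\mathbf{b}$ in $V$, and note that if $L$ is the line segment joining $\mathbf{a}$ and $\mathbf{b}$, then
the sets $U \cap L$ and $V \cap L$ are disjoint, non-empty, and open in $L$, and their
union is $L$. This contradicts connectedness of $L$.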

It follows that in R n all open balls and open cubes and rectangles are 
connected. (See the exercises.) 

EXERCISES 

1. Let R + denote the set of positive real numbers. 

(a) Show that the continuous function / : R+ - R given by f(x) = 

is bounded but has neither a maximum value nor a minimum 

value. 

(b) Show that the continuous function $g : \mathbf{R}_+ \to \mathbf{R}$ given by $g(x) =
\sin(1/x)$ is bounded but does not satisfy the condition of uniform
continuity on $\mathbf{R}_+$.

2. Let $X$ denote the subset $(-1, 1) \times 0$ of $\mathbf{R}^2$, and let $U$ be the open ball
$B(\mathbf{0}; 1)$ in $\mathbf{R}^2$, which contains $X$. Show there is no $\epsilon > 0$ such that the
$\epsilon$-neighborhood of $X$ in $\mathbf{R}^2$ is contained in $U$.

3. Let $\mathbf{R}^\infty$ be the set of all "infinite-tuples" $\mathbf{x} = (x_1, x_2, \ldots)$ of real numbers
that end in an infinite string of 0's. (See the exercises of §1.) Define
an inner product on $\mathbf{R}^\infty$ by the rule $\langle \mathbf{x}, \mathbf{y} \rangle = \sum x_i y_i$. (This is a finite
sum, since all but finitely many terms vanish.) Let $\| \mathbf{x} - \mathbf{y} \|$ be the
corresponding metric on $\mathbf{R}^\infty$. Define

$$\mathbf{e}_i = (0, \ldots, 0, 1, 0, \ldots, 0, \ldots),$$

where 1 appears in the $i$th place. Then the $\mathbf{e}_i$ form a basis for $\mathbf{R}^\infty$. Let $X$
be the set of all the points $\mathbf{e}_i$. Show that $X$ is closed, bounded, and
non-compact.

4. (a) Show that open balls and open cubes in $\mathbf{R}^n$ are convex.

(b) Show that (open and closed) rectangles in $\mathbf{R}^n$ are convex.



Differentiation 


In this chapter, we consider functions mapping $\mathbf{R}^m$ into $\mathbf{R}^n$, and we define
what we mean by the derivative of such a function. Much of our discussion
will simply generalize facts that are already familiar to you from calculus.

The two major results of this chapter are the inverse function theorem,
which gives conditions under which a differentiable function from $\mathbf{R}^n$ to $\mathbf{R}^n$ has
a differentiable inverse, and the implicit function theorem, which provides
the theoretical underpinning for the technique of implicit differentiation as
studied in calculus.

Recall that we write the elements of $\mathbf{R}^m$ and $\mathbf{R}^n$ as column matrices unless
specifically stated otherwise.


§5. THE DERIVATIVE 


First, let us recall how the derivative of a real- valued function of a real variable 
is defined. 

Let $A$ be a subset of $\mathbf{R}$; let $\phi : A \to \mathbf{R}$. Suppose $A$ contains a
neighborhood of the point $a$. We define the derivative of $\phi$ at $a$ by the equation

$$\phi'(a) = \lim_{t \to 0} \frac{\phi(a + t) - \phi(a)}{t},$$

provided the limit exists. In this case, we say that $\phi$ is differentiable at $a$.
The following facts are an immediate consequence:

(1) Differentiable functions are continuous.

(2) Composites of differentiable functions are differentiable.

We seek now to define the derivative of a function $f$ mapping a subset of
$\mathbf{R}^m$ into $\mathbf{R}^n$. We cannot simply replace $a$ and $t$ in the definition just given by
points of $\mathbf{R}^m$, for we cannot divide a point of $\mathbf{R}^n$ by a point of $\mathbf{R}^m$ if $m > 1$!
Here is a first attempt at a definition:

Definition. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. Suppose $A$ contains a
neighborhood of $\mathbf{a}$. Given $\mathbf{u} \in \mathbf{R}^m$ with $\mathbf{u} \ne \mathbf{0}$, define

$$f'(\mathbf{a}; \mathbf{u}) = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{u}) - f(\mathbf{a})}{t},$$

provided the limit exists. This limit depends both on $\mathbf{a}$ and on $\mathbf{u}$; it is called
the directional derivative of $f$ at $\mathbf{a}$ with respect to the vector $\mathbf{u}$. (In
calculus, one usually requires $\mathbf{u}$ to be a unit vector, but that is not necessary.)


EXAMPLE 1. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be given by the equation

$$f(x_1, x_2) = x_1 x_2.$$

The directional derivative of $f$ at $\mathbf{a} = (a_1, a_2)$ with respect to the vector
$\mathbf{u} = (1, 0)$ is

$$f'(\mathbf{a}; \mathbf{u}) = \lim_{t \to 0} \frac{(a_1 + t) a_2 - a_1 a_2}{t} = a_2.$$

With respect to the vector $\mathbf{v} = (1, 2)$, the directional derivative is

$$f'(\mathbf{a}; \mathbf{v}) = \lim_{t \to 0} \frac{(a_1 + t)(a_2 + 2t) - a_1 a_2}{t} = a_2 + 2 a_1.$$
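The two limits in Example 1 can be approximated by evaluating the difference quotient at a small value of $t$. This is only an illustrative sketch: the helper name `directional_quotient`, the base point $(3, 5)$, and the step $10^{-6}$ are assumptions made for the demonstration.

```python
def f(x1, x2):
    """The function of Example 1."""
    return x1 * x2

def directional_quotient(func, a, u, t):
    """Difference quotient (func(a + t*u) - func(a)) / t for a, u in R^2."""
    return (func(a[0] + t * u[0], a[1] + t * u[1]) - func(a[0], a[1])) / t

a = (3.0, 5.0)   # illustrative base point, so a_1 = 3 and a_2 = 5

# With u = (1, 0) the quotient should be close to a_2 = 5.
approx_u = directional_quotient(f, a, (1.0, 0.0), 1e-6)

# With v = (1, 2) the quotient should be close to a_2 + 2*a_1 = 11.
approx_v = directional_quotient(f, a, (1.0, 2.0), 1e-6)
```

For the vector $\mathbf{v}$ the quotient equals $11 + 2t$ exactly (before rounding), which shows how the extra term disappears in the limit.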


It is tempting to believe that the "directional derivative" is the appropriate
generalization of the notion of "derivative," and to say that $f$ is
differentiable at $\mathbf{a}$ if $f'(\mathbf{a}; \mathbf{u})$ exists for every $\mathbf{u} \ne \mathbf{0}$. This would not, however, be a
very useful definition of differentiability. It would not follow, for instance, that
differentiability implies continuity. (See Example 3 following.) Nor would it
follow that composites of differentiable functions are differentiable. (See the
exercises of §7.) So we seek something stronger.

In order to motivate our eventual definition, let us reformulate the defi- 
nition of differentiability in the single-variable case as follows: 

Let $A$ be a subset of $\mathbf{R}$; let $\phi : A \to \mathbf{R}$. Suppose $A$ contains a
neighborhood of $a$. We say that $\phi$ is differentiable at $a$ if there is a number $\lambda$ such
that

$$\frac{\phi(a + t) - \phi(a) - \lambda t}{t} \to 0 \quad \text{as} \quad t \to 0.$$

The number $\lambda$, which is unique, is called the derivative of $\phi$ at $a$, and denoted
$\phi'(a)$.

This formulation of the definition makes explicit the fact that if $\phi$ is
differentiable, then the linear function $\lambda t$ is a good approximation to the "increment
function" $\phi(a + t) - \phi(a)$; we often call $\lambda t$ the "first-order approximation" or
the "linear approximation" to the increment function.

Let us generalize this version of the definition. If $A \subset \mathbf{R}^m$ and if $f : A \to
\mathbf{R}^n$, what might we mean by a "first-order" or "linear" approximation to the
increment function $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})$? The natural thing to do is to take a
function that is linear in the sense of linear algebra. This idea leads to the
following definition:

Definition. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. Suppose $A$ contains a
neighborhood of $\mathbf{a}$. We say that $f$ is differentiable at $\mathbf{a}$ if there is an $n$ by
$m$ matrix $B$ such that

$$\frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B \cdot \mathbf{h}}{|\mathbf{h}|} \to \mathbf{0} \quad \text{as} \quad \mathbf{h} \to \mathbf{0}.$$

The matrix $B$, which is unique, is called the derivative of $f$ at $\mathbf{a}$; it is denoted
$Df(\mathbf{a})$.

Note that the quotient of which we are taking the limit is defined for $\mathbf{h}$
in some deleted neighborhood of $\mathbf{0}$, since the domain of $f$ contains a
neighborhood of $\mathbf{a}$. Use of the sup norm in the denominator is not essential; one
obtains an equivalent definition if one replaces $|\mathbf{h}|$ by $\|\mathbf{h}\|$.

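It is easy to see that $B$ is unique. Suppose $C$ is another matrix satisfying
this condition. Subtracting, we have

$$\frac{(C - B) \cdot \mathbf{h}}{|\mathbf{h}|} \to \mathbf{0}$$

as $\mathbf{h} \to \mathbf{0}$. Let $\mathbf{u}$ be a fixed non-zero vector; set $\mathbf{h} = t\mathbf{u}$; let $t \to 0$. It follows that
$(C - B) \cdot \mathbf{u} = \mathbf{0}$. Since $\mathbf{u}$ is arbitrary, $C = B$.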

EXAMPLE 2. Let $f : \mathbf{R}^m \to \mathbf{R}^n$ be defined by the equation

$$f(\mathbf{x}) = B \cdot \mathbf{x} + \mathbf{b},$$

where $B$ is an $n$ by $m$ matrix, and $\mathbf{b} \in \mathbf{R}^n$. Then $f$ is differentiable and
$Df(\mathbf{x}) = B$. Indeed, since

$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = B \cdot \mathbf{h},$$

the quotient used in defining the derivative vanishes identically.

We now show that this definition is stronger than the tentative one we
gave earlier, and that it is indeed a "suitable" definition of differentiability.
Specifically, we verify the following facts, in this section and those following: 

(1) Differentiable functions are continuous. 

(2) Composites of differentiable functions are differentiable. 

(3) Differentiability of / at a implies the existence of all the directional 
derivatives of / at a. 

We also show how to compute the derivative when it exists. 

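Theorem 5.1. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. If $f$ is differentiable
at $\mathbf{a}$, then all the directional derivatives of $f$ at $\mathbf{a}$ exist, and

$$f'(\mathbf{a}; \mathbf{u}) = Df(\mathbf{a}) \cdot \mathbf{u}.$$

Proof. Let $B = Df(\mathbf{a})$. Set $\mathbf{h} = t\mathbf{u}$ in the definition of differentiability,
where $t \ne 0$. Then by hypothesis,

$$(*) \qquad \frac{f(\mathbf{a} + t\mathbf{u}) - f(\mathbf{a}) - B \cdot t\mathbf{u}}{|t\mathbf{u}|} \to \mathbf{0}$$

as $t \to 0$. If $t$ approaches 0 through positive values, we multiply $(*)$ by $|\mathbf{u}|$ to
conclude that

$$\frac{f(\mathbf{a} + t\mathbf{u}) - f(\mathbf{a})}{t} - B \cdot \mathbf{u} \to \mathbf{0}$$

as $t \to 0$, as desired. If $t$ approaches 0 through negative values, we multiply
$(*)$ by $-|\mathbf{u}|$ to reach the same conclusion. Thus $f'(\mathbf{a}; \mathbf{u}) = B \cdot \mathbf{u}$. □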


EXAMPLE 3. Define $f : \mathbf{R}^2 \to \mathbf{R}$ by setting $f(\mathbf{0}) = 0$ and

$$f(x, y) = x^2 y / (x^4 + y^2) \quad \text{if} \quad (x, y) \ne \mathbf{0}.$$

We show all directional derivatives of $f$ exist at $\mathbf{0}$, but that $f$ is not
differentiable at $\mathbf{0}$. Let $\mathbf{u} \ne \mathbf{0}$. Then

$$\frac{f(\mathbf{0} + t\mathbf{u}) - f(\mathbf{0})}{t} = \frac{(th)^2 (tk)}{(th)^4 + (tk)^2} \cdot \frac{1}{t} = \frac{h^2 k}{t^2 h^4 + k^2} \quad \text{if} \quad \mathbf{u} = \begin{bmatrix} h \\ k \end{bmatrix},$$

so that

$$f'(\mathbf{0}; \mathbf{u}) = \begin{cases} h^2 / k & \text{if } k \ne 0, \\ 0 & \text{if } k = 0. \end{cases}$$

Thus $f'(\mathbf{0}; \mathbf{u})$ exists for all $\mathbf{u} \ne \mathbf{0}$. However, the function $f$ is not differentiable
at $\mathbf{0}$. For if $g : \mathbf{R}^2 \to \mathbf{R}$ is a function that is differentiable at $\mathbf{0}$, then $Dg(\mathbf{0})$ is
a 1 by 2 matrix of the form $[a \;\; b]$, and

$$g'(\mathbf{0}; \mathbf{u}) = ah + bk,$$

which is a linear function of $\mathbf{u}$. But $f'(\mathbf{0}; \mathbf{u})$ is not a linear function of $\mathbf{u}$.

The function $f$ is particularly interesting. It is differentiable (and hence
continuous) on each straight line through the origin. (In fact, on the straight
line $y = mx$, it has the value $mx/(m^2 + x^2)$.) But $f$ is not differentiable at
the origin; in fact, $f$ is not even continuous at the origin! For $f$ has value 0
at the origin, while arbitrarily near the origin are points of the form $(t, t^2)$, at
which $f$ has value $1/2$. See Figure 5.1.

Figure 5.1 (figure label: $f(t, t^2) = 1/2$)
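Both claims of Example 3 are easy to probe numerically: the difference quotients at the origin settle down to $h^2/k$, while the values along the parabola $y = x^2$ stay at $1/2$. The sketch below uses hypothetical helper names and illustrative sample values; it is a sanity check, not a proof.

```python
def f(x, y):
    """The function of Example 3: x^2 y / (x^4 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

def quotient_at_origin(h, k, t):
    """Difference quotient (f(t*u) - f(0)) / t for u = (h, k)."""
    return f(t * h, t * k) / t

# Along the parabola y = x^2 the value stays at 1/2, however small t is,
# so f cannot be continuous at the origin.
values_on_parabola = [f(t, t * t) for t in (1e-1, 1e-3, 1e-5)]

# For u = (2, 3) the quotient approaches h^2/k = 4/3.
approx = quotient_at_origin(2.0, 3.0, 1e-6)

# For u = (1, 0) the quotient is identically 0.
approx_k_zero = quotient_at_origin(1.0, 0.0, 1e-3)
```

Note that $h^2/k$ is not linear in $(h, k)$; doubling only $k$ halves the value, which is the obstruction to differentiability pointed out in the example.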


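Theorem 5.2. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. If $f$ is differentiable
at $\mathbf{a}$, then $f$ is continuous at $\mathbf{a}$.

Proof. Let $B = Df(\mathbf{a})$. For $\mathbf{h}$ near $\mathbf{0}$ but different from $\mathbf{0}$, write

$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = |\mathbf{h}| \left[ \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B \cdot \mathbf{h}}{|\mathbf{h}|} \right] + B \cdot \mathbf{h}.$$

By hypothesis, the expression in brackets approaches $\mathbf{0}$ as $\mathbf{h}$ approaches $\mathbf{0}$.
Then, by our basic theorems on limits,

$$\lim_{\mathbf{h} \to \mathbf{0}} [f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})] = \mathbf{0}.$$

Thus $f$ is continuous at $\mathbf{a}$. □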


We shall deal with composites of differentiable functions in § 7. 

Now we show how to calculate $Df(\mathbf{a})$, provided it exists. We first
introduce the notion of the "partial derivatives" of a real-valued function.

Definition. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}$. We define the $j$th partial
derivative of $f$ at $\mathbf{a}$ to be the directional derivative of $f$ at $\mathbf{a}$ with respect
to the vector $\mathbf{e}_j$, provided this derivative exists; and we denote it by $D_j f(\mathbf{a})$.
That is,

$$D_j f(\mathbf{a}) = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{e}_j) - f(\mathbf{a})}{t}.$$

Partial derivatives are usually easy to calculate. Indeed, if we set

$$\phi(t) = f(a_1, \ldots, a_{j-1}, t, a_{j+1}, \ldots, a_m),$$

then the $j$th partial derivative of $f$ at $\mathbf{a}$ equals, by definition, simply the
ordinary derivative of the function $\phi$ at the point $t = a_j$. Thus the partial
derivative $D_j f$ can be calculated by treating $x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_m$ as
constants, and differentiating the resulting function with respect to $x_j$, using
the familiar differentiation rules for functions of a single variable.

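We begin by calculating the derivative $Df$ in the case where $f$ is a
real-valued function.

Theorem 5.3. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}$. If $f$ is differentiable
at $\mathbf{a}$, then

$$Df(\mathbf{a}) = [\, D_1 f(\mathbf{a}) \;\; D_2 f(\mathbf{a}) \;\; \cdots \;\; D_m f(\mathbf{a}) \,].$$

That is, if $Df(\mathbf{a})$ exists, it is the row matrix whose entries are the partial
derivatives of $f$ at $\mathbf{a}$.

Proof. By hypothesis, $Df(\mathbf{a})$ exists and is a matrix of size 1 by $m$. Let

$$Df(\mathbf{a}) = [\, \lambda_1 \;\; \lambda_2 \;\; \cdots \;\; \lambda_m \,].$$

It follows (using Theorem 5.1) that

$$D_j f(\mathbf{a}) = f'(\mathbf{a}; \mathbf{e}_j) = Df(\mathbf{a}) \cdot \mathbf{e}_j = \lambda_j. \quad \square$$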

We generalize this theorem as follows: 

Theorem 5.4. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. Suppose $A$ contains
a neighborhood of $\mathbf{a}$. Let $f_i : A \to \mathbf{R}$ be the $i$th component function of $f$.

(a) The function $f$ is differentiable at $\mathbf{a}$ if and only if each component
function $f_i$ is differentiable at $\mathbf{a}$.

(b) If $f$ is differentiable at $\mathbf{a}$, then its derivative is the $n$ by $m$ matrix
whose $i$th row is the derivative of the function $f_i$.

This theorem tells us that

$$Df(\mathbf{a}) = \begin{bmatrix} Df_1(\mathbf{a}) \\ \vdots \\ Df_n(\mathbf{a}) \end{bmatrix},$$

so that $Df(\mathbf{a})$ is the matrix whose entry in row $i$ and column $j$ is $D_j f_i(\mathbf{a})$.

Proof. Let $B$ be an arbitrary $n$ by $m$ matrix. Consider the function

$$F(\mathbf{h}) = \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B \cdot \mathbf{h}}{|\mathbf{h}|},$$

which is defined for $0 < |\mathbf{h}| < \epsilon$ (for some $\epsilon$). Now $F(\mathbf{h})$ is a column matrix
of size $n$ by 1. Its $i$th entry satisfies the equation

$$F_i(\mathbf{h}) = \frac{f_i(\mathbf{a} + \mathbf{h}) - f_i(\mathbf{a}) - (\text{row } i \text{ of } B) \cdot \mathbf{h}}{|\mathbf{h}|}.$$

Let $\mathbf{h}$ approach $\mathbf{0}$. Then the matrix $F(\mathbf{h})$ approaches $\mathbf{0}$ if and only if each of
its entries approaches 0. Hence if $B$ is a matrix for which $F(\mathbf{h}) \to \mathbf{0}$, then the
$i$th row of $B$ is a matrix for which $F_i(\mathbf{h}) \to 0$. And conversely. The theorem
follows. □


Let $A \subset \mathbf{R}^m$ and $f : A \to \mathbf{R}^n$. If the partial derivatives of the component
functions $f_i$ of $f$ exist at $\mathbf{a}$, then one can form the matrix that has $D_j f_i(\mathbf{a})$ as
its entry in row $i$ and column $j$. This matrix is called the Jacobian matrix
of $f$. If $f$ is differentiable at $\mathbf{a}$, this matrix equals $Df(\mathbf{a})$. However, it is
possible for the partial derivatives, and hence the Jacobian matrix, to exist,
without it following that $f$ is differentiable at $\mathbf{a}$. (See Example 3 preceding.)
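As an illustration of how the Jacobian matrix is assembled entry by entry, the following sketch approximates each $D_j f_i(\mathbf{a})$ by a central difference quotient in the direction $\mathbf{e}_j$. The function `jacobian`, the step size, and the test map `sample` are assumptions chosen for the demonstration. Such a computation only approximates the partial derivatives; as noted above, their existence does not by itself establish differentiability.

```python
import math

def jacobian(func, a, step=1e-6):
    """Approximate the matrix with entry D_j f_i(a) in row i, column j.

    func maps a list of m floats to a list of n floats; central differences
    in the direction e_j stand in for the limit defining D_j f_i(a).
    """
    m = len(a)
    n = len(func(a))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        plus = list(a)
        minus = list(a)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return J

def sample(p):
    """Hypothetical test map R^2 -> R^2: (x, y) -> (sin(xy), x y^2)."""
    x, y = p
    return [math.sin(x * y), x * y * y]

J = jacobian(sample, [1.0, 2.0])
# Hand computation at (1, 2): row 1 is [2 cos 2, cos 2], row 2 is [4, 4].
```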

This fact leaves us in something of a quandary. We have no convenient way
at present for determining whether or not a function is differentiable (other
than going back to the definition). We know that such familiar functions as

$$\sin(xy) \quad \text{and} \quad xy^2 + z e^{xy}$$

have partial derivatives, for that fact is a consequence of familiar theorems
from single-variable analysis. But we do not know they are differentiable.

We shall deal with this problem in the next section.

REMARK. If $m = 1$ or $n = 1$, our definition of the derivative is simply
a reformulation, in matrix notation, of concepts familiar from calculus. For
instance, if $f : \mathbf{R}^1 \to \mathbf{R}^3$ is a differentiable function, its derivative is the column
matrix

$$Df(t) = \begin{bmatrix} f_1'(t) \\ f_2'(t) \\ f_3'(t) \end{bmatrix}.$$

In calculus, $f$ is often interpreted as a parametrized curve, and the vector

$$\vec{v} = f_1'(t)\,\mathbf{e}_1 + f_2'(t)\,\mathbf{e}_2 + f_3'(t)\,\mathbf{e}_3$$

is called the velocity vector of the curve. (Of course, in calculus one is apt to
use $\vec{\imath}$, $\vec{\jmath}$, and $\vec{k}$ for the unit basis vectors in $\mathbf{R}^3$ rather than $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$.)

For another example, consider a differentiable function $g : \mathbf{R}^3 \to \mathbf{R}$. Its
derivative is the row matrix

$$Dg(\mathbf{x}) = [\, D_1 g(\mathbf{x}) \;\; D_2 g(\mathbf{x}) \;\; D_3 g(\mathbf{x}) \,],$$

and the directional derivative equals the matrix product $Dg(\mathbf{x}) \cdot \mathbf{u}$. In calculus,
the function $g$ is often interpreted as a scalar field, and the vector field

$$\operatorname{grad} g = (D_1 g)\,\mathbf{e}_1 + (D_2 g)\,\mathbf{e}_2 + (D_3 g)\,\mathbf{e}_3$$

is called the gradient of $g$. (It is often denoted by the symbol $\nabla g$.) The
directional derivative of $g$ with respect to $\mathbf{u}$ is written in calculus as the dot
product of the vectors $\operatorname{grad} g$ and $\mathbf{u}$.

Note that vector notation is adequate for dealing with the derivative of
$f$ when either the domain or the range of $f$ has dimension 1. For a general
function $f : \mathbf{R}^m \to \mathbf{R}^n$, matrix notation is needed.

EXERCISES 

1. Let $A \subset \mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. Show that if $f'(\mathbf{a}; \mathbf{u})$ exists, then $f'(\mathbf{a}; c\mathbf{u})$
exists and equals $c f'(\mathbf{a}; \mathbf{u})$.

2. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be defined by setting $f(\mathbf{0}) = 0$ and

$$f(x, y) = xy / (x^2 + y^2) \quad \text{if} \quad (x, y) \ne \mathbf{0}.$$

(a) For which vectors $\mathbf{u} \ne \mathbf{0}$ does $f'(\mathbf{0}; \mathbf{u})$ exist? Evaluate it when it
exists.

(b) Do $D_1 f$ and $D_2 f$ exist at $\mathbf{0}$?

(c) Is $f$ differentiable at $\mathbf{0}$?

(d) Is $f$ continuous at $\mathbf{0}$?

3. Repeat Exercise 2 for the function $f$ defined by setting $f(\mathbf{0}) = 0$ and

$$f(x, y) = x^2 y^2 / (x^2 y^2 + (y - x)^2) \quad \text{if} \quad (x, y) \ne \mathbf{0}.$$

4. Repeat Exercise 2 for the function $f$ defined by setting $f(\mathbf{0}) = 0$ and

$$f(x, y) = x^3 / (x^2 + y^2) \quad \text{if} \quad (x, y) \ne \mathbf{0}.$$

5. Repeat Exercise 2 for the function

$$f(x, y) = |x| + |y|.$$

6. Repeat Exercise 2 for the function

$$f(x, y) = |xy|^{1/2}.$$

7. Repeat Exercise 2 for the function $f$ defined by setting $f(\mathbf{0}) = 0$ and

$$f(x, y) = x|y| / (x^2 + y^2)^{1/2} \quad \text{if} \quad (x, y) \ne \mathbf{0}.$$


§6. CONTINUOUSLY DIFFERENTIABLE FUNCTIONS 

In this section, we obtain a useful criterion for differentiability. We know that 
mere existence of the partial derivatives does not imply differentiability. If, 
however, we impose the (comparatively mild) additional condition that these 
partial derivatives be continuous, then differentiability is assured. 

We begin by recalling the mean-value theorem of single-variable analysis:

Theorem 6.1 (Mean-value theorem). If $\phi : [a, b] \to \mathbf{R}$ is continuous
at each point of the closed interval $[a, b]$, and differentiable at each
point of the open interval $(a, b)$, then there exists a point $c$ of $(a, b)$
such that

$$\phi(b) - \phi(a) = \phi'(c)(b - a). \quad \square$$

In practice, we most often apply this theorem when $\phi$ is differentiable on
an open interval containing $[a, b]$. In this case, of course, $\phi$ is continuous on
$[a, b]$.

Theorem 6.2. Let $A$ be open in $\mathbf{R}^m$; let $f : A \to \mathbf{R}^n$. Suppose that the partial
derivatives $D_j f_i(\mathbf{x})$ of the component functions of $f$ exist at each point
$\mathbf{x}$ of $A$ and are continuous on $A$. Then $f$ is differentiable at each point
of $A$.

A function satisfying the hypotheses of this theorem is often said to be
continuously differentiable, or of class $C^1$, on $A$.

Proof. In view of Theorem 5.4, it suffices to prove that each component
function of $f$ is differentiable. Therefore we may restrict ourselves to the case
of a real-valued function $f : A \to \mathbf{R}$.

Let $\mathbf{a}$ be a point of $A$. We are given that, for some $\epsilon$, the partial derivatives
$D_j f(\mathbf{x})$ exist and are continuous for $|\mathbf{x} - \mathbf{a}| < \epsilon$. We wish to show that $f$ is
differentiable at $\mathbf{a}$.

Step 1. Let $\mathbf{h}$ be a point of $\mathbf{R}^m$ with $0 < |\mathbf{h}| < \epsilon$; let $h_1, \ldots, h_m$ be the
components of $\mathbf{h}$. Consider the following sequence of points of $\mathbf{R}^m$:

$$\begin{aligned}
\mathbf{p}_0 &= \mathbf{a}, \\
\mathbf{p}_1 &= \mathbf{a} + h_1 \mathbf{e}_1, \\
\mathbf{p}_2 &= \mathbf{a} + h_1 \mathbf{e}_1 + h_2 \mathbf{e}_2, \\
&\;\;\vdots \\
\mathbf{p}_m &= \mathbf{a} + h_1 \mathbf{e}_1 + \cdots + h_m \mathbf{e}_m = \mathbf{a} + \mathbf{h}.
\end{aligned}$$

The points $\mathbf{p}_i$ all belong to the (closed) cube $C$ of radius $|\mathbf{h}|$ centered at $\mathbf{a}$.
Figure 6.1 illustrates the case where $m = 3$ and all $h_i$ are positive.

Since we are concerned with the differentiability of $f$, we shall need to
deal with the difference $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})$. We begin by writing it in the form

$$(*) \qquad f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{j=1}^{m} [f(\mathbf{p}_j) - f(\mathbf{p}_{j-1})].$$

Consider the general term of this summation. Let $j$ be fixed, and define

$$\phi(t) = f(\mathbf{p}_{j-1} + t\mathbf{e}_j).$$

Assume $h_j \ne 0$ for the moment. As $t$ ranges over the closed interval $I$ with
end points 0 and $h_j$, the point $\mathbf{p}_{j-1} + t\mathbf{e}_j$ ranges over the line segment from
$\mathbf{p}_{j-1}$ to $\mathbf{p}_j$; this line segment lies in $C$, and hence in $A$. Thus $\phi$ is defined for
$t$ in an open interval about $I$.

As $t$ varies, only the $j$th component of the point $\mathbf{p}_{j-1} + t\mathbf{e}_j$ varies. Hence
because $D_j f$ exists at each point of $A$, the function $\phi$ is differentiable on
an open interval containing $I$. Applying the mean-value theorem to $\phi$, we
conclude that

$$\phi(h_j) - \phi(0) = \phi'(c_j)\, h_j$$

for some point $c_j$ between 0 and $h_j$. (This argument applies whether $h_j$ is
positive or negative.) We can rewrite this equation in the form

$$(**) \qquad f(\mathbf{p}_j) - f(\mathbf{p}_{j-1}) = D_j f(\mathbf{q}_j)\, h_j,$$

where $\mathbf{q}_j$ is the point $\mathbf{p}_{j-1} + c_j \mathbf{e}_j$ of the line segment from $\mathbf{p}_{j-1}$ to $\mathbf{p}_j$, which
lies in $C$.

We derived $(**)$ under the assumption that $h_j \ne 0$. If $h_j = 0$, then $(**)$
holds automatically, for any point $\mathbf{q}_j$ of $C$.

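Using $(**)$, we rewrite $(*)$ in the form

$$(***) \qquad f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{j=1}^{m} D_j f(\mathbf{q}_j)\, h_j,$$

where each point $\mathbf{q}_j$ lies in the cube $C$ of radius $|\mathbf{h}|$ centered at $\mathbf{a}$.

Step 2. We prove the theorem. Let $B$ be the matrix

$$B = [\, D_1 f(\mathbf{a}) \;\; \cdots \;\; D_m f(\mathbf{a}) \,].$$

Then

$$B \cdot \mathbf{h} = \sum_{j=1}^{m} D_j f(\mathbf{a})\, h_j.$$

Using $(***)$, we have

$$\frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B \cdot \mathbf{h}}{|\mathbf{h}|} = \sum_{j=1}^{m} \frac{[D_j f(\mathbf{q}_j) - D_j f(\mathbf{a})]\, h_j}{|\mathbf{h}|};$$

then we let $\mathbf{h} \to \mathbf{0}$. Since $\mathbf{q}_j$ lies in the cube $C$ of radius $|\mathbf{h}|$ centered at $\mathbf{a}$,
we have $\mathbf{q}_j \to \mathbf{a}$. Since the partials of $f$ are continuous at $\mathbf{a}$, the factors in
brackets all go to zero. The factors $h_j / |\mathbf{h}|$ are of course bounded in absolute
value by 1. Hence the entire expression goes to zero, as desired. □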

One effect of this theorem is to reassure us that the functions familiar to us
from calculus are in fact differentiable. We know how to compute the partial
derivatives of such functions as $\sin(xy)$ and $xy^2 + z e^{xy}$, and we know that
these partials are continuous. Therefore these functions are differentiable.
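For a function such as $\sin(xy)$, the conclusion of Theorem 6.2 can be observed numerically: with $B$ the row matrix of partial derivatives, the quotient in the definition of differentiability shrinks as $\mathbf{h}$ shrinks. The sketch below is illustrative only; the base point $(1, 2)$, the direction of approach, and the name `remainder_quotient` are assumptions.

```python
import math

def f(x, y):
    return math.sin(x * y)

def remainder_quotient(a, h):
    """|f(a + h) - f(a) - Df(a).h| / |h|, with |h| the sup norm."""
    x, y = a
    # Row matrix [D_1 f(a)  D_2 f(a)], computed by the usual calculus rules.
    grad = (y * math.cos(x * y), x * math.cos(x * y))
    increment = f(x + h[0], y + h[1]) - f(x, y)
    linear = grad[0] * h[0] + grad[1] * h[1]
    return abs(increment - linear) / max(abs(h[0]), abs(h[1]))

# Let h = (s, s) shrink toward 0; the quotient should shrink with it.
quotients = [remainder_quotient((1.0, 2.0), (s, s)) for s in (1e-1, 1e-2, 1e-3)]
```

In this example the quotient decreases roughly in proportion to $s$, which is consistent with (though of course does not prove) differentiability at the chosen point.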

In practice, we usually deal only with functions that are of class $C^1$. While
it is interesting to know there are functions that are differentiable but not of
class $C^1$, such functions occur rarely enough that we need not be concerned
with them.

Suppose $f$ is a function mapping an open set $A$ of $\mathbf{R}^m$ into $\mathbf{R}^n$, and suppose
the partial derivatives $D_j f_i$ of the component functions of $f$ exist on $A$. These
then are functions from $A$ to $\mathbf{R}$, and we may consider their partial derivatives,
which have the form $D_k(D_j f_i)$ and are called the second-order partial
derivatives of $f$. Similarly, one defines the third-order partial derivatives
of the functions $f_i$, or more generally the partial derivatives of order $r$ for
arbitrary $r$.

If the partial derivatives of the functions $f_i$ of order less than or equal
to $r$ are continuous on $A$, we say $f$ is of class $C^r$ on $A$. Then the function $f$
is of class $C^r$ on $A$ if and only if each function $D_j f_i$ is of class $C^{r-1}$ on $A$.
We say $f$ is of class $C^\infty$ on $A$ if the partials of the functions $f_i$ of all orders
are continuous on $A$.

As you may recall, for most functions the "mixed" partial derivatives

$$D_k D_j f_i \quad \text{and} \quad D_j D_k f_i$$

are equal. This result in fact holds under the hypothesis that the function $f$
is of class $C^2$, as we now show.

Theorem 6.3. Let $A$ be open in $\mathbf{R}^m$; let $f : A \to \mathbf{R}$ be a function
of class $C^2$. Then for each $\mathbf{a} \in A$,

$$D_k D_j f(\mathbf{a}) = D_j D_k f(\mathbf{a}).$$

Proof. Since one calculates the partial derivatives in question by letting
all variables other than $x_k$ and $x_j$ remain constant, it suffices to consider the
case where $f$ is a function merely of two variables. So we assume that $A$ is
open in $\mathbf{R}^2$, and that $f : A \to \mathbf{R}$ is of class $C^2$.

Step 1. We first prove a certain "second-order" mean-value theorem
for $f$. Let

$$Q = [a, a + h] \times [b, b + k]$$

be a rectangle contained in $A$. Define

$$\lambda(h, k) = f(a, b) - f(a + h, b) - f(a, b + k) + f(a + h, b + k).$$

Then $\lambda$ is the sum, with appropriate signs, of the values of $f$ at the four
vertices of $Q$. See Figure 6.2. We show that there are points $\mathbf{p}$ and $\mathbf{q}$ of $Q$
such that

$$\lambda(h, k) = D_2 D_1 f(\mathbf{p}) \cdot hk, \quad \text{and}$$
$$\lambda(h, k) = D_1 D_2 f(\mathbf{q}) \cdot hk.$$



By symmetry, it suffices to prove the first of these equations. To begin,
we define

$$\phi(s) = f(s, b + k) - f(s, b).$$

Then $\phi(a + h) - \phi(a) = \lambda(h, k)$, as you can check. Because $D_1 f$ exists in $A$,
the function $\phi$ is differentiable in an open interval containing $[a, a + h]$. The
mean-value theorem implies that

$$\phi(a + h) - \phi(a) = \phi'(s_0) \cdot h$$

for some $s_0$ between $a$ and $a + h$. This equation can be rewritten in the form

$$(*) \qquad \lambda(h, k) = [D_1 f(s_0, b + k) - D_1 f(s_0, b)] \cdot h.$$

Now $s_0$ is fixed, and we consider the function $D_1 f(s_0, t)$. Because $D_2 D_1 f$
exists in $A$, this function is differentiable for $t$ in an open interval about
$[b, b + k]$. We apply the mean-value theorem once more to conclude that

$$(**) \qquad D_1 f(s_0, b + k) - D_1 f(s_0, b) = D_2 D_1 f(s_0, t_0) \cdot k$$

for some $t_0$ between $b$ and $b + k$. Combining $(*)$ and $(**)$ gives our desired
result.

Step 2. We prove the theorem. Given the point $\mathbf{a} = (a, b)$ of $A$ and
given $t > 0$, let $Q_t$ be the rectangle

$$Q_t = [a, a + t] \times [b, b + t].$$

If $t$ is sufficiently small, $Q_t$ is contained in $A$; then Step 1 implies that

$$\lambda(t, t) = D_2 D_1 f(\mathbf{p}_t) \cdot t^2$$

for some point $\mathbf{p}_t$ in $Q_t$. If we let $t \to 0$, then $\mathbf{p}_t \to \mathbf{a}$. Because $D_2 D_1 f$ is
continuous, it follows that

$$\lambda(t, t)/t^2 \to D_2 D_1 f(\mathbf{a}) \quad \text{as} \quad t \to 0.$$

A similar argument, using the other equation from Step 1, implies that

$$\lambda(t, t)/t^2 \to D_1 D_2 f(\mathbf{a}) \quad \text{as} \quad t \to 0.$$

The theorem follows. □
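The quantity $\lambda(t, t)/t^2$ used in the proof is a second-order difference quotient, and for a concrete $C^2$ function it can be compared with the mixed partial derivative computed by hand. In the sketch below the test function, the point $(1, 2)$, and the step $t = 10^{-4}$ are hypothetical choices made only for illustration.

```python
import math

def f(x, y):
    """Hypothetical C^2 test function, chosen only for illustration."""
    return math.sin(x * y) + x * y * y

def second_difference(a, b, h, k):
    """lambda(h, k) = f(a,b) - f(a+h,b) - f(a,b+k) + f(a+h,b+k)."""
    return f(a, b) - f(a + h, b) - f(a, b + k) + f(a + h, b + k)

a, b = 1.0, 2.0
t = 1e-4
estimate = second_difference(a, b, t, t) / (t * t)

# Computed by hand, both D_2 D_1 f and D_1 D_2 f equal
# cos(xy) - x*y*sin(xy) + 2*y for this particular f.
exact = math.cos(a * b) - a * b * math.sin(a * b) + 2.0 * b
```

The single number `estimate` approximates both mixed partials at once, which mirrors the way the proof squeezes $D_2 D_1 f(\mathbf{a})$ and $D_1 D_2 f(\mathbf{a})$ out of the same limit.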


EXERCISES 

1. Show that the function $f(x, y) = |xy|$ is differentiable at $\mathbf{0}$, but is not of
class $C^1$ in any neighborhood of $\mathbf{0}$.

2. Define $f : \mathbf{R} \to \mathbf{R}$ by setting $f(0) = 0$, and

$$f(t) = t^2 \sin(1/t) \quad \text{if} \quad t \ne 0.$$

(a) Show $f$ is differentiable at 0, and calculate $f'(0)$.

(b) Calculate $f'(t)$ if $t \ne 0$.

(c) Show $f'$ is not continuous at 0.

(d) Conclude that $f$ is differentiable on $\mathbf{R}$ but not of class $C^1$ on $\mathbf{R}$.

3. Show that the proof of Theorem 6.2 goes through if we assume merely
that the partials $D_j f$ exist in a neighborhood of $\mathbf{a}$ and are continuous
at $\mathbf{a}$.

4. Show that if $A \subset \mathbf{R}^m$ and $f : A \to \mathbf{R}$, and if the partials $D_j f$ exist and
are bounded in a neighborhood of $\mathbf{a}$, then $f$ is continuous at $\mathbf{a}$.

5. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation

$$f(r, \theta) = (r \cos \theta, \; r \sin \theta).$$

It is called the polar coordinate transformation.

(a) Calculate $Df$ and $\det Df$.

(b) Sketch the image under $f$ of the set $S = [1, 2] \times [0, \pi]$. [Hint: Find
the images under $f$ of the line segments that bound $S$.]

6. Repeat Exercise 5 for the function / : R 2 — ► R 2 given by 

f(x,y) = ( x 2 - y 2 ,2xy). 

Take S to be the set 

S — {(£, y) | x 2 + y 2 < a 2 and x > 0 and y > 0}. 

[Hint: Parametrize part of the boundary of S by setting x — a cost and 
y = asinf; find the image of this curve. Proceed similarly for the rest of 
the boundary of 5.] 

We remark that if one identifies the complex numbers C with R 2 in 
the usual way, then / is just the function f(z) = z 2 . 

7. Repeat Exercise 5 for the function / : R 2 —+ R 2 given by 

f(x,y) = (e x cosy, e x siny). 

Take S to be the set S — [0,1] x [0, tt]. 

We remark that if one identifies C with R 2 as usual, then / is the 
function /(z) = e z . 

8. Repeat Exercise 5 for the function / : R 3 — ► R 3 given by 

/ (p, <f > , 9) — (p cos 9 sin 0, /?sin 9 sin 0, p cos 0). 

It is called the spherical coordinate transformation. Take S to be 
the set 

S = [1,2] x [0, 7r/2] x [0, 7r/2] . 

9. Let $g : \mathbf{R} \to \mathbf{R}$ be a function of class $C^2$. Show that

\[ \lim_{h \to 0} \frac{g(a+h) - 2g(a) + g(a-h)}{h^2} = g''(a). \]

[Hint: Consider Step 1 of Theorem 6.3 in the case $f(x,y) = g(x+y)$.]
*10. Define $f : \mathbf{R}^2 \to \mathbf{R}$ by setting $f(0) = 0$, and

\[ f(x,y) = xy(x^2 - y^2)/(x^2 + y^2) \quad \text{if } (x,y) \neq 0. \]

(a) Show $D_1 f$ and $D_2 f$ exist at 0.

(b) Calculate $D_1 f$ and $D_2 f$ at $(x,y) \neq 0$.

(c) Show f is of class $C^1$ on $\mathbf{R}^2$. [Hint: Show $D_1 f(x,y)$ equals the
product of y and a bounded function, and $D_2 f(x,y)$ equals the product
of x and a bounded function.]

(d) Show that $D_2 D_1 f$ and $D_1 D_2 f$ exist at 0, but are not equal there.

§7. THE CHAIN RULE


In this section we show that the composite of two differentiable functions 
is differentiable, and we derive a formula for its derivative. This formula is 
commonly called the “chain rule.” 

Theorem 7.1. Let $A \subset \mathbf{R}^m$; let $B \subset \mathbf{R}^n$. Let

\[ f : A \to \mathbf{R}^n \quad \text{and} \quad g : B \to \mathbf{R}^p, \]

with $f(A) \subset B$. Suppose $f(a) = b$. If f is differentiable at a, and if g
is differentiable at b, then the composite function $g \circ f$ is differentiable
at a. Furthermore,

\[ D(g \circ f)(a) = Dg(b) \cdot Df(a), \]

where the indicated product is matrix multiplication.

Although this version of the chain rule may look a bit strange, it is really 
just the familiar chain rule of calculus in a new guise. You can convince 
yourself of this fact by writing the formula out in terms of partial derivatives. 
We shall return to this matter later. 
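
As a quick numerical illustration of Theorem 7.1 (not part of the proof), the sketch below compares a finite-difference approximation of $D(g \circ f)(a)$ with the matrix product $Dg(b) \cdot Df(a)$. The functions f and g, the point a, and the helper names `jacobian` and `matmul` are our own choices made for illustration; they are assumptions, not objects taken from the text.

```python
import math

def f(x):
    # An arbitrary smooth map f : R^2 -> R^3, chosen only for illustration.
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x1 + x2 ** 2]

def g(y):
    # An arbitrary smooth map g : R^3 -> R^2, chosen only for illustration.
    y1, y2, y3 = y
    return [y1 + y2 * y3, math.exp(y1) - y3]

def jacobian(func, point, eps=1e-6):
    """Central-difference approximation of the derivative matrix of func at point."""
    rows = len(func(point))
    cols = len(point)
    J = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        plus = list(point)
        plus[j] += eps
        minus = list(point)
        minus[j] -= eps
        fp, fm = func(plus), func(minus)
        for i in range(rows):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)   # entry D_j func_i
    return J

def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [0.5, -1.0]
b = f(a)

lhs = jacobian(lambda x: g(f(x)), a)            # approximates D(g o f)(a), a 2 by 2 matrix
rhs = matmul(jacobian(g, b), jacobian(f, a))    # Dg(b) . Df(a), (2 by 3) times (3 by 2)

max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two matrices agree up to the error of the finite-difference approximation, which is what the theorem predicts; the check is of course no substitute for the proof that follows.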

Proof. For convenience, let x denote the general point of $\mathbf{R}^m$, and let y
denote the general point of $\mathbf{R}^n$.

By hypothesis, g is defined in a neighborhood of b; choose $\epsilon$ so that g(y)
is defined for $|y - b| < \epsilon$. Similarly, since f is defined in a neighborhood of a
and is continuous at a, we can choose $\delta$ so that f(x) is defined and satisfies
the condition $|f(x) - b| < \epsilon$, for $|x - a| < \delta$. Then the composite function
$(g \circ f)(x) = g(f(x))$ is defined for $|x - a| < \delta$. See Figure 7.1.



Figure 7.1

Step 1. Throughout, let $\Delta(h)$ denote the function

\[ \Delta(h) = f(a + h) - f(a), \]

which is defined for $|h| < \delta$. First, we show that the quotient $|\Delta(h)|/|h|$ is
bounded for h in some deleted neighborhood of 0.

For this purpose, let us introduce the function F(h) defined by setting
$F(0) = 0$ and

\[ F(h) = \frac{\Delta(h) - Df(a) \cdot h}{|h|} \quad \text{for } 0 < |h| < \delta. \]

Because f is differentiable at a, the function F is continuous at 0. Further-
more, one has the equation

\[ \Delta(h) = Df(a) \cdot h + |h|\,F(h) \tag{$*$} \]

for $0 < |h| < \delta$, and also for $h = 0$ (trivially). The triangle inequality implies
that

\[ |\Delta(h)| \le m\,|Df(a)|\,|h| + |h|\,|F(h)|. \]

Now $|F(h)|$ is bounded for h in a neighborhood of 0; in fact, it approaches 0
as h approaches 0. Therefore $|\Delta(h)|/|h|$ is bounded on a deleted neighbor-
hood of 0.

Step 2. We repeat the construction of Step 1 for the function g. We
define a function G(k) by setting $G(0) = 0$ and

\[ G(k) = \frac{g(b + k) - g(b) - Dg(b) \cdot k}{|k|} \quad \text{for } 0 < |k| < \epsilon. \]

Because g is differentiable at b, the function G is continuous at 0. Further-
more, for $|k| < \epsilon$, G satisfies the equation

\[ g(b + k) - g(b) = Dg(b) \cdot k + |k|\,G(k). \tag{$**$} \]

Step 3. We prove the theorem. Let h be any point of $\mathbf{R}^m$ with $|h| < \delta$.
Then $|\Delta(h)| < \epsilon$, so we may substitute $\Delta(h)$ for k in formula (**). After this
substitution, $b + k$ becomes

\[ b + \Delta(h) = f(a) + \Delta(h) = f(a + h), \]

so formula (**) takes the form

\[ g(f(a + h)) - g(f(a)) = Dg(b) \cdot \Delta(h) + |\Delta(h)|\,G(\Delta(h)). \]

Now we use (*) to rewrite this equation in the form

\[ \frac{1}{|h|}\bigl[g(f(a + h)) - g(f(a)) - Dg(b) \cdot Df(a) \cdot h\bigr]
   = Dg(b) \cdot F(h) + \frac{1}{|h|}\,|\Delta(h)|\,G(\Delta(h)). \]

This equation holds for $0 < |h| < \delta$. In order to show that $g \circ f$ is differentiable
at a with derivative $Dg(b) \cdot Df(a)$, it suffices to show that the right side of
this equation goes to zero as h approaches 0.

The matrix Dg(b) is constant, while $F(h) \to 0$ as $h \to 0$ (because F
is continuous at 0 and vanishes there). The factor $G(\Delta(h))$ also approaches
zero as $h \to 0$; for it is the composite of two functions G and $\Delta$, both of
which are continuous at 0 and vanish there. Finally, $|\Delta(h)|/|h|$ is bounded
in a deleted neighborhood of 0, by Step 1. The theorem follows. □

Here is an immediate consequence: 

Corollary 7.2. Let A be open in $\mathbf{R}^m$; let B be open in $\mathbf{R}^n$. Let

\[ f : A \to \mathbf{R}^n \quad \text{and} \quad g : B \to \mathbf{R}^p, \]

with $f(A) \subset B$. If f and g are of class $C^r$, so is the composite function
$g \circ f$.

Proof. The chain rule gives us the formula

\[ D(g \circ f)(x) = Dg(f(x)) \cdot Df(x), \]

which holds for $x \in A$.

Suppose first that f and g are of class $C^1$. Then the entries of Dg are
continuous real-valued functions defined on B; because f is continuous on
A, the composite function Dg(f(x)) is also continuous on A. Similarly, the
entries of the matrix Df(x) are continuous on A. Because the entries of the
matrix product are algebraic functions of the entries of the matrices involved,
the entries of the product $Dg(f(x)) \cdot Df(x)$ are also continuous on A. Then
$g \circ f$ is of class $C^1$ on A.

To prove the general case, we proceed by induction. Suppose the theorem
is true for functions of class $C^{r-1}$. Let f and g be of class $C^r$. Then the
entries of Dg are real-valued functions of class $C^{r-1}$ on B. Now f is of class
$C^{r-1}$ on A (being in fact of class $C^r$); hence the induction hypothesis implies
that the function $D_j g_i(f(x))$, which is a composite of two functions of class
$C^{r-1}$, is of class $C^{r-1}$. Since the entries of the matrix Df(x) are also of class
$C^{r-1}$ on A by hypothesis, the entries of the product $Dg(f(x)) \cdot Df(x)$ are
of class $C^{r-1}$ on A. Hence $g \circ f$ is of class $C^r$ on A, as desired.

The theorem follows for r finite. If now f and g are of class $C^\infty$, then they
are of class $C^r$ for every r, whence $g \circ f$ is also of class $C^r$ for every r. □


As another application of the chain rule, we generalize the mean-value
theorem of single-variable analysis to real-valued functions defined in $\mathbf{R}^m$. We
will use this theorem in the next section.

Theorem 7.3 (Mean-value theorem). Let A be open in $\mathbf{R}^m$; let
$f : A \to \mathbf{R}$ be differentiable on A. If A contains the line segment with
end points a and a + h, then there is a point $c = a + t_0 h$ with $0 < t_0 < 1$
of this line segment such that

\[ f(a + h) - f(a) = Df(c) \cdot h. \]


Proof. Set $\phi(t) = f(a + th)$; then $\phi$ is defined for t in an open interval
about [0, 1]. Being the composite of differentiable functions, $\phi$ is differentiable;
its derivative is given by the formula

\[ \phi'(t) = Df(a + th) \cdot h. \]

The ordinary mean-value theorem implies that

\[ \phi(1) - \phi(0) = \phi'(t_0) \cdot 1 \]

for some $t_0$ with $0 < t_0 < 1$. This equation can be rewritten in the form

\[ f(a + h) - f(a) = Df(a + t_0 h) \cdot h. \quad □ \]
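
The following sketch locates the point $c = a + t_0 h$ of Theorem 7.3 for one concrete function. The function $f(x,y) = x^3 + xy$, the points a and h, and the use of bisection are assumptions made for this illustration; bisection is legitimate here only because $\phi'$ happens to be increasing on [0, 1] for this particular f.

```python
def f(p):
    # Sample function f(x, y) = x^3 + x*y, chosen only for illustration.
    x, y = p
    return x ** 3 + x * y

def Df(p):
    # The 1 by 2 matrix [D_1 f, D_2 f], computed by hand for this f.
    x, y = p
    return [3 * x ** 2 + y, x]

a = (0.0, 1.0)
h = (1.0, 1.0)

def point(t):
    # The point a + t*h on the segment from a to a + h.
    return (a[0] + t * h[0], a[1] + t * h[1])

def phi_prime(t):
    # phi'(t) = Df(a + t*h) . h, by the chain rule.
    d = Df(point(t))
    return d[0] * h[0] + d[1] * h[1]

target = f(point(1.0)) - f(point(0.0))   # f(a + h) - f(a)

# Bisection for phi'(t0) = target on [0, 1].
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if phi_prime(mid) < target:
        lo = mid
    else:
        hi = mid

t0 = (lo + hi) / 2
c = point(t0)
print(t0, c)
```

Here $\phi(t) = t^3 + t^2 + t$, so the equation $\phi'(t_0) = 3$ reduces to $3t_0^2 + 2t_0 - 2 = 0$, whose root in (0, 1) is $t_0 = (-2 + \sqrt{28})/6$.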


As yet another application of the chain rule, we consider the problem of 
differentiating an inverse function. 

Recall the situation that occurs in single-variable analysis. Suppose $\phi(x)$
is differentiable on an open interval, with $\phi'(x) > 0$ on that interval. Then $\phi$
is strictly increasing and has an inverse function $\psi$, which is defined by letting
$\psi(y)$ be that unique number x such that $\phi(x) = y$. The function $\psi$ is in fact
differentiable, and its derivative satisfies the equation

\[ \psi'(y) = 1/\phi'(x), \]

where $y = \phi(x)$.

There is a similar formula for differentiating the inverse of a function f
of several variables. In the present section, we do not consider the question
whether the function f has an inverse, or whether that inverse is differentiable.
We consider only the problem of finding the derivative of the inverse function.

Theorem 7.4. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$; let $f(a) = b$.
Suppose that g maps a neighborhood of b into $\mathbf{R}^n$, that $g(b) = a$, and

\[ g(f(x)) = x \]

for all x in a neighborhood of a. If f is differentiable at a and if g is
differentiable at b, then

\[ Dg(b) = [Df(a)]^{-1}. \]

Proof. Let $i : \mathbf{R}^n \to \mathbf{R}^n$ be the identity function; its derivative is the
identity matrix $I_n$. We are given that

\[ g(f(x)) = i(x) \]

for all x in a neighborhood of a. The chain rule implies that

\[ Dg(b) \cdot Df(a) = I_n. \]

Thus Dg(b) is the inverse matrix to Df(a) (see Theorem 2.5). □

The preceding theorem implies that if a differentiable function f is to have
a differentiable inverse, it is necessary that the matrix Df be non-singular.
It is a somewhat surprising fact that this condition is also sufficient for a
function f of class $C^1$ to have an inverse, at least locally. We shall prove this
fact in the next section.
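
To see Theorem 7.4 at work numerically, the sketch below takes the map $f(x,y) = (e^x \cos y, e^x \sin y)$ of Exercise 7 of §6 together with an explicit local inverse g, and compares Dg(b) with $[Df(a)]^{-1}$. The point a, the formula chosen for g (valid only for $-\pi < y < \pi$), and the helper names are assumptions made for illustration; the derivatives are approximated by central differences.

```python
import math

def f(p):
    x, y = p
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def g(q):
    # A local inverse of f, valid on the strip -pi < y < pi.
    u, v = q
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def jacobian(func, point, eps=1e-6):
    """Central-difference approximation of the derivative matrix of func at point."""
    n = len(point)
    cols = []
    for j in range(n):
        plus = list(point)
        plus[j] += eps
        minus = list(point)
        minus[j] -= eps
        fp, fm = func(plus), func(minus)
        cols.append([(fp[i] - fm[i]) / (2 * eps) for i in range(len(fp))])
    # cols[j][i] approximates D_j func_i; transpose to obtain rows.
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def inverse_2x2(M):
    """Inverse of a non-singular 2 by 2 matrix via the adjugate formula."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

a = (0.3, 0.7)
b = f(a)

Dg_b = jacobian(g, b)
Df_a_inv = inverse_2x2(jacobian(f, a))

gap = max(abs(Dg_b[i][j] - Df_a_inv[i][j]) for i in range(2) for j in range(2))
print(gap)
```

Note that the check presupposes what the theorem presupposes: that an inverse g exists and is differentiable near b. Here that is known because g is given by an explicit formula.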

REMARK. Let us make a comment on notation. The usefulness of well-chosen 
notation can hardly be overemphasized. Arguments that are obscure, and 
formulas that are complicated, sometimes become beautifully simple once the 
proper notation is chosen. Our use of matrix notation for the derivative is a 
case in point. The formulas for the derivatives of a composite function and an 
inverse function could hardly be simpler. 

Nevertheless, a word may be in order for those who remember the notation 
used in calculus for partial derivatives, and the version of the chain rule proved 
there. 

In advanced mathematics, it is usual to use either the functional notation
$\phi'$ or the operator notation $D\phi$ for the derivative of a real-valued function
of a real variable. ($D\phi$ denotes a 1 by 1 matrix in this case!) In calculus,
however, another notation is common. One often denotes the derivative $\phi'(x)$
by the symbol $d\phi/dx$, or, introducing the “variable” y by setting $y = \phi(x)$,
by the symbol dy/dx. This notation was introduced by Leibnitz, one of
the originators of calculus. It comes from the time when the focus of every
physical and mathematical problem was on the variables involved, and when
functions as such were hardly even thought about.

The Leibnitz notation has some familiar virtues. For one thing, it makes
the chain rule easy to remember. Given functions $\phi : \mathbf{R} \to \mathbf{R}$ and $\psi : \mathbf{R} \to \mathbf{R}$,
the derivative of the composite function $\psi \circ \phi$ is given by the formula

\[ D(\psi \circ \phi)(x) = D\psi(\phi(x)) \cdot D\phi(x). \]

If we introduce variables by setting $y = \phi(x)$ and $z = \psi(y)$, then the derivative
of the composite function $z = \psi(\phi(x))$ can be expressed in the Leibnitz
notation by the formula

\[ \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}. \]

The latter formula is easy to remember because it looks like the formula for
multiplying fractions! However, this notation has its ambiguities. The letter
“z,” when it appears on the left side of this equation, denotes one function (a
function of x); and when it appears on the right side, it denotes a different
function (a function of y). This can lead to difficulties when it comes to
computing higher derivatives unless one is very careful.

The formula for the derivative of an inverse function is also easy to re-
member. If $y = \phi(x)$ has the inverse function $x = \psi(y)$, then the derivative
of $\psi$ is expressed in Leibnitz notation by the equation

\[ dx/dy = \frac{1}{dy/dx}, \]

which looks like the formula for the reciprocal of a fraction!

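The Leibnitz notation can easily be extended to functions of several vari-
ables. If $A \subset \mathbf{R}^m$ and $f : A \to \mathbf{R}$, we often set

\[ y = f(x) = f(x_1, \ldots, x_m), \]

and denote the partial derivative $D_i f$ by one of the symbols

\[ \frac{\partial f}{\partial x_i} \quad \text{or} \quad \frac{\partial y}{\partial x_i}. \]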

The Leibnitz notation is not nearly as convenient in this situation. Con-
sider the chain rule, for example. If

\[ f : \mathbf{R}^m \to \mathbf{R}^n \quad \text{and} \quad g : \mathbf{R}^n \to \mathbf{R}, \]

then the composite function $F = g \circ f$ maps $\mathbf{R}^m$ into $\mathbf{R}$, and its derivative is
given by the formula

\[ DF(x) = Dg(f(x)) \cdot Df(x), \tag{$*$} \]

which can be written out in the form

\[ \bigl[ D_1 F(x) \ \cdots \ D_m F(x) \bigr]
   = \bigl[ D_1 g(f(x)) \ \cdots \ D_n g(f(x)) \bigr]
     \begin{bmatrix}
       D_1 f_1(x) & \cdots & D_m f_1(x) \\
       \vdots     &        & \vdots     \\
       D_1 f_n(x) & \cdots & D_m f_n(x)
     \end{bmatrix}. \]

The formula for the j-th partial derivative of F is thus given by the equa-
tion

\[ D_j F(x) = \sum_{k=1}^{n} D_k g(f(x))\, D_j f_k(x). \]

If we shift to “variable” notation by setting $y = f(x)$ and $z = g(y)$, this
equation becomes

\[ \frac{\partial z}{\partial x_j} = \sum_{k=1}^{n} \frac{\partial z}{\partial y_k}\,\frac{\partial y_k}{\partial x_j}; \]

this is probably the version of the chain rule you learned in calculus. Only
familiarity would suggest that it is easier to remember than (*)! Certainly
one cannot obtain the formula for $\partial z/\partial x_j$ by a simple-minded multiplication
of fractions, as in the single-variable case.

The formula for the derivative of an inverse function is even more trou-
blesome. Suppose $f : \mathbf{R}^2 \to \mathbf{R}^2$ is differentiable and has a differentiable inverse
function g. The derivative of g is given by the formula

\[ Dg(y) = [Df(x)]^{-1}, \]

where $y = f(x)$. In Leibnitz notation, this formula takes the form

\[ \begin{bmatrix}
     \partial x_1/\partial y_1 & \partial x_1/\partial y_2 \\
     \partial x_2/\partial y_1 & \partial x_2/\partial y_2
   \end{bmatrix}
   =
   \begin{bmatrix}
     \partial y_1/\partial x_1 & \partial y_1/\partial x_2 \\
     \partial y_2/\partial x_1 & \partial y_2/\partial x_2
   \end{bmatrix}^{-1}. \]

Recalling the formula for the inverse of a matrix, we see that the partial
derivative $\partial x_i/\partial y_j$ is about as far from being the reciprocal of the partial
derivative $\partial y_j/\partial x_i$ as one could imagine!

EXERCISES


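1. Let $f : \mathbf{R}^3 \to \mathbf{R}^2$ satisfy the conditions $f(0) = (1,2)$ and

\[ Df(0) = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix}. \]

Let $g : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation

\[ g(x,y) = (x + 2y + 1,\ 3xy). \]

Find $D(g \circ f)(0)$.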

2. Let $f : \mathbf{R}^2 \to \mathbf{R}^3$ and $g : \mathbf{R}^3 \to \mathbf{R}^2$ be given by the equations

\[ f(x) = (e^{2x_1 + x_2},\ 3x_2 - \cos x_1,\ x_1^2 + x_2 + 2), \]

\[ g(y) = (3y_1 + 2y_2 + y_3^2,\ y_1^2 - y_3 + 1). \]

(a) If $F(x) = g(f(x))$, find DF(0). [Hint: Don’t compute F explicitly.]

(b) If $G(y) = f(g(y))$, find DG(0).

3. Let $f : \mathbf{R}^3 \to \mathbf{R}$ and $g : \mathbf{R}^2 \to \mathbf{R}$ be differentiable. Let $F : \mathbf{R}^2 \to \mathbf{R}$ be
defined by the equation

\[ F(x,y) = f(x, y, g(x,y)). \]

(a) Find DF in terms of the partials of f and g.

(b) If $F(x,y) = 0$ for all (x,y), find $D_1 g$ and $D_2 g$ in terms of the partials
of f.

4. Let $g : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation $g(x,y) = (x, y + x^2)$. Let
$f : \mathbf{R}^2 \to \mathbf{R}$ be the function defined in Example 3 of §5. Let $h = f \circ g$.
Show that the directional derivatives of f and g exist everywhere, but
that there is a $u \neq 0$ for which $h'(0;u)$ does not exist.


§8. THE INVERSE FUNCTION THEOREM 

Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$ be of class $C^1$. We know that for f
to have a differentiable inverse, it is necessary that the derivative Df(x) of f
be non-singular. We now prove that this condition is also sufficient for f to
have a differentiable inverse, at least locally. This result is called the inverse
function theorem.

We begin by showing that non-singularity of Df implies that f is locally
one-to-one.

Lemma 8.1. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$ be of class $C^1$.
If Df(a) is non-singular, then there exists an $\alpha > 0$ such that the
inequality

\[ |f(x_0) - f(x_1)| \ge \alpha\,|x_0 - x_1| \]

holds for all $x_0, x_1$ in some open cube $C(a;\epsilon)$ centered at a. It follows
that f is one-to-one on this open cube.

Proof. Let $E = Df(a)$; then E is non-singular. We first consider the
linear transformation that maps x to $E \cdot x$. We compute

\[ |x_0 - x_1| = |E^{-1} \cdot (E \cdot x_0 - E \cdot x_1)|
              \le n\,|E^{-1}| \cdot |E \cdot x_0 - E \cdot x_1|. \]

If we set $2\alpha = 1/(n|E^{-1}|)$, then for all $x_0, x_1$ in $\mathbf{R}^n$,

\[ |E \cdot x_0 - E \cdot x_1| \ge 2\alpha\,|x_0 - x_1|. \]

Now we prove the lemma. Consider the function

\[ H(x) = f(x) - E \cdot x. \]

Then $DH(x) = Df(x) - E$, so that $DH(a) = 0$. Because H is of class $C^1$, we
can choose $\epsilon > 0$ so that $|DH(x)| < \alpha/n$ for x in the open cube $C = C(a;\epsilon)$.
The mean-value theorem, applied to the i-th component function of H, tells
us that, given $x_0, x_1 \in C$, there is a $c \in C$ such that

\[ |H_i(x_0) - H_i(x_1)| = |DH_i(c) \cdot (x_0 - x_1)| \le n(\alpha/n)\,|x_0 - x_1|. \]

Then for $x_0, x_1 \in C$, we have

\[ \begin{aligned}
   \alpha\,|x_0 - x_1| &\ge |H(x_0) - H(x_1)| \\
     &= |f(x_0) - E \cdot x_0 - f(x_1) + E \cdot x_1| \\
     &\ge |E \cdot x_1 - E \cdot x_0| - |f(x_1) - f(x_0)| \\
     &\ge 2\alpha\,|x_1 - x_0| - |f(x_1) - f(x_0)|.
   \end{aligned} \]

The lemma follows. □ 
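
The constants in the proof can be traced through for a specific map. The sketch below does so for $f(x,y) = (x + y^2, y + x^2)$ at a = 0, where $E = Df(0)$ is the identity matrix; the map, the grid of sample points, and all variable names are assumptions chosen for illustration. As in the text, $|\cdot|$ denotes the sup norm of a vector or matrix.

```python
def f(p):
    # Sample map with Df(0) equal to the identity matrix.
    x, y = p
    return (x + y * y, y + x * x)

def sup_norm(v):
    return max(abs(t) for t in v)

n = 2
E_inv = [[1.0, 0.0], [0.0, 1.0]]                  # inverse of E = Df(0)
E_inv_norm = max(abs(e) for row in E_inv for e in row)
alpha = 1.0 / (2 * n * E_inv_norm)                # from 2*alpha = 1/(n |E^{-1}|)

# For this f, DH(x) = Df(x) - E = [[0, 2y], [2x, 0]], so |DH(x)| < alpha/n
# holds on the open cube C(0; eps) with eps = alpha/(2n).
eps = alpha / (2 * n)

# Sample points strictly inside the cube and test the inequality of the lemma.
steps = 8
grid = [(-eps + 2 * eps * (i + 0.5) / steps, -eps + 2 * eps * (j + 0.5) / steps)
        for i in range(steps) for j in range(steps)]

worst_ratio = min(
    sup_norm((f(p)[0] - f(q)[0], f(p)[1] - f(q)[1]))
    / sup_norm((p[0] - q[0], p[1] - q[1]))
    for p in grid for q in grid if p != q
)
print(alpha, eps, worst_ratio)
```

On the sampled points the ratio $|f(x_0) - f(x_1)|/|x_0 - x_1|$ stays well above $\alpha = 1/4$; the lemma guarantees only the bound $\alpha$, which is far from sharp in this example.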

Now we show that non-singularity of Df, in the case where f is one-to-
one, implies that the inverse function is differentiable.

Theorem 8.2. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$ be of class $C^r$;
let $B = f(A)$. If f is one-to-one on A and if Df(x) is non-singular for
$x \in A$, then the set B is open in $\mathbf{R}^n$ and the inverse function $g : B \to A$
is of class $C^r$.

Proof. Step 1. We prove the following elementary result: If $\phi : A \to \mathbf{R}$
is differentiable and if $\phi$ has a local minimum at $x_0 \in A$, then $D\phi(x_0) = 0$.

To say that $\phi$ has a local minimum at $x_0$ means that $\phi(x) \ge \phi(x_0)$ for
all x in a neighborhood of $x_0$. Then given $u \neq 0$,

\[ \phi(x_0 + tu) - \phi(x_0) \ge 0 \]

for all sufficiently small values of t. Therefore

\[ \phi'(x_0;u) = \lim_{t \to 0}\, [\phi(x_0 + tu) - \phi(x_0)]/t \]

is non-negative if t approaches 0 through positive values, and is non-positive
if t approaches 0 through negative values. It follows that $\phi'(x_0;u) = 0$. In
particular, $D_j\phi(x_0) = 0$ for all j, so that $D\phi(x_0) = 0$.

Step 2. We show that the set B is open in $\mathbf{R}^n$. Given $b \in B$, we show B
contains some open ball $B(b;\delta)$ about b.

We begin by choosing a rectangle Q lying in A whose interior contains
the point $a = f^{-1}(b)$ of A. The set Bd Q is compact, being closed and
bounded in $\mathbf{R}^n$. Then the set f(Bd Q) is also compact, and thus is closed and
bounded in $\mathbf{R}^n$. Because f is one-to-one, f(Bd Q) is disjoint from b; because
f(Bd Q) is closed, we can choose $\delta > 0$ so that the ball $B(b;2\delta)$ is disjoint
from f(Bd Q). Given $c \in B(b;\delta)$ we show that $c = f(x)$ for some $x \in Q$; it
then follows that the set $f(A) = B$ contains each point of $B(b;\delta)$, as desired.
See Figure 8.1.



Figure 8.1

Given $c \in B(b;\delta)$, consider the real-valued function

\[ \phi(x) = \|f(x) - c\|^2, \]

which is of class $C^r$. Because Q is compact, this function has a minimum
value on Q; suppose that this minimum value occurs at the point x of Q. We
show that $f(x) = c$.

Now the value of $\phi$ at the point a is

\[ \phi(a) = \|f(a) - c\|^2 = \|b - c\|^2 < \delta^2. \]

Hence the minimum value of $\phi$ on Q must be less than $\delta^2$. It follows that this
minimum value cannot occur on Bd Q, for if $x \in$ Bd Q, the point f(x) lies
outside the ball $B(b;2\delta)$, so that $\|f(x) - c\| \ge \delta$. Thus the minimum value
of $\phi$ occurs at a point x of Int Q.

Because x is interior to Q, it follows that $\phi$ has a local minimum at x;
then by Step 1, the derivative of $\phi$ vanishes at x. Since

\[ \phi(x) = \sum_{k=1}^{n} (f_k(x) - c_k)^2, \]

we have

\[ D_j\phi(x) = \sum_{k=1}^{n} 2(f_k(x) - c_k)\, D_j f_k(x). \]

The equation $D\phi(x) = 0$ can be written in matrix form as

\[ 2\,\bigl[ (f_1(x) - c_1) \ \cdots \ (f_n(x) - c_n) \bigr] \cdot Df(x) = 0. \]

Now Df(x) is non-singular, by hypothesis. Multiplying both sides of this
equation on the right by the inverse of Df(x), we see that $f(x) - c = 0$, as
desired.

Step 3. The function $f : A \to B$ is one-to-one by hypothesis; let $g :
B \to A$ be the inverse function. We show g is continuous.

Continuity of g is equivalent to the statement that for each open set U of
A, the set $V = g^{-1}(U)$ is open in B. But $V = f(U)$; and Step 2, applied to
the set U, which is open in A and hence open in $\mathbf{R}^n$, tells us that V is open
in $\mathbf{R}^n$ and hence open in B. See Figure 8.2.

Figure 8.2

It is an interesting fact that the results of Steps 2 and 3 hold without assuming
that Df(x) is non-singular, or even that f is differentiable. If A is open in
$\mathbf{R}^n$ and $f : A \to \mathbf{R}^n$ is continuous and one-to-one, then it is true that f(A) is
open in $\mathbf{R}^n$ and the inverse function g is continuous. This result is known as
the Brouwer theorem on invariance of domain. Its proof requires the tools
of algebraic topology and is quite difficult. We have proved the differentiable
version of this theorem.

Step 4. Given $b \in B$, we show that g is differentiable at b.

Let a be the point g(b), and let $E = Df(a)$. We show that the function

\[ G(k) = \frac{g(b + k) - g(b) - E^{-1} \cdot k}{|k|}, \]

which is defined for k in a deleted neighborhood of 0, approaches 0 as k
approaches 0. Then g is differentiable at b with derivative $E^{-1}$.

Let us define

\[ \Delta(k) = g(b + k) - g(b) \]

for k near 0. We first show that there is an $\epsilon > 0$ such that $|\Delta(k)|/|k|$ is
bounded for $0 < |k| < \epsilon$. (This would follow from differentiability of g, but
that is what we are trying to prove!) By the preceding lemma, there is a
neighborhood C of a and an $\alpha > 0$ such that

\[ |f(x_0) - f(x_1)| \ge \alpha\,|x_0 - x_1| \]

for $x_0, x_1 \in C$. Now f(C) is a neighborhood of b, by Step 2; choose $\epsilon$ so that
b + k is in f(C) whenever $|k| < \epsilon$. Then for $|k| < \epsilon$, we can set $x_0 = g(b + k)$
and $x_1 = g(b)$ and rewrite the preceding inequality in the form

\[ |(b + k) - b| \ge \alpha\,|g(b + k) - g(b)|, \]

which implies that

\[ 1/\alpha \ge |\Delta(k)|/|k|, \]

as desired.

Now we show that $G(k) \to 0$ as $k \to 0$. Let $0 < |k| < \epsilon$. We have

\[ \begin{aligned}
   G(k) &= \frac{\Delta(k) - E^{-1} \cdot k}{|k|} \quad \text{by definition,} \\
        &= -E^{-1} \cdot \left[ \frac{k - E \cdot \Delta(k)}{|\Delta(k)|} \right] \frac{|\Delta(k)|}{|k|}.
   \end{aligned} \]

(Here we use the fact that $\Delta(k) \neq 0$ for $k \neq 0$, which follows from the fact
that g is one-to-one.) Now $E^{-1}$ is constant, and $|\Delta(k)|/|k|$ is bounded. It
remains to show that the expression in brackets goes to zero. We have

\[ b + k = f(g(b + k)) = f(g(b) + \Delta(k)) = f(a + \Delta(k)). \]

Thus the expression in brackets equals

\[ \frac{f(a + \Delta(k)) - f(a) - E \cdot \Delta(k)}{|\Delta(k)|}. \]

Let $k \to 0$. Then $\Delta(k) \to 0$ as well, because g is continuous. Since f is
differentiable at a with derivative E, this expression goes to zero, as desired.

Step 5. Finally, we show the inverse function g is of class $C^r$.

Because g is differentiable, Theorem 7.4 applies to show that its derivative
is given by the formula

\[ Dg(y) = [Df(g(y))]^{-1}, \]

for $y \in B$. The function Dg thus equals the composite of three functions:

\[ B \xrightarrow{\ g\ } A \xrightarrow{\ Df\ } GL(n) \xrightarrow{\ I\ } GL(n), \]

where GL(n) is the set of non-singular n by n matrices, and I is the function
that maps each non-singular matrix to its inverse. Now the function I is given
by a specific formula involving determinants. In fact, the entries of I(C) are
rational functions of the entries of C; as such, they are $C^\infty$ functions of the
entries of C.

We proceed by induction on r. Suppose f is of class $C^1$. Then Df is
continuous. Because g and I are also continuous (indeed, g is differentiable
and I is of class $C^\infty$), the composite function, which equals Dg, is also
continuous. Hence g is of class $C^1$.

Suppose the theorem holds for functions of class $C^{r-1}$. Let f be of
class $C^r$. Then in particular f is of class $C^{r-1}$, so that (by the induction
hypothesis), the inverse function g is of class $C^{r-1}$. Furthermore, the function
Df is of class $C^{r-1}$. We invoke Corollary 7.2 to conclude that the composite
function, which equals Dg, is of class $C^{r-1}$. Then g is of class $C^r$. □


Finally, we prove the inverse function theorem.

Theorem 8.3 (The inverse function theorem). Let A be open
in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$ be of class $C^r$. If Df(x) is non-singular at
the point a of A, there is a neighborhood U of the point a such that f
carries U in a one-to-one fashion onto an open set V of $\mathbf{R}^n$ and the
inverse function is of class $C^r$.

Proof. By Lemma 8.1, there is a neighborhood $U_0$ of a on which f is
one-to-one. Because det Df(x) is a continuous function of x, and det $Df(a) \neq
0$, there is a neighborhood $U_1$ of a such that det $Df(x) \neq 0$ on $U_1$. If U equals
the intersection of $U_0$ and $U_1$, then the hypotheses of the preceding theorem
are satisfied for $f : U \to \mathbf{R}^n$. The theorem follows. □
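
The theorem asserts that a local inverse exists but gives no formula for it. In practice one can often evaluate the inverse numerically; the sketch below does this with Newton's method, which is a different technique from the one used in the proof and is introduced here only as an illustration. The map $f(x,y) = (x^2 - y^2, 2xy)$ is taken from Exercise 6 of §6; the point a, the target c, and the function name `local_inverse` are assumptions.

```python
def f(p):
    x, y = p
    return (x * x - y * y, 2 * x * y)

def Df(p):
    # Derivative matrix of f; its determinant is 4(x^2 + y^2).
    x, y = p
    return ((2 * x, -2 * y), (2 * y, 2 * x))

def local_inverse(c, start, iterations=30):
    """Newton's method for f(x) = c, started at a point where Df is non-singular."""
    x, y = start
    for _ in range(iterations):
        (p, q), (r, s) = Df((x, y))
        det = p * s - q * r
        u, v = f((x, y))
        ru, rv = u - c[0], v - c[1]          # residual f(x) - c
        # Newton step: subtract Df(x)^{-1} applied to the residual.
        x -= (s * ru - q * rv) / det
        y -= (-r * ru + p * rv) / det
    return (x, y)

a = (1.0, 0.5)
b = f(a)                                     # b = (0.75, 1.0)
c = (b[0] + 0.05, b[1] - 0.03)               # a point near b

x_c = local_inverse(c, a)
print(x_c, f(x_c))
```

Starting the iteration at a selects the branch of the inverse that carries b to a; this f is not one-to-one on all of $\mathbf{R}^2$ (it sends x and −x to the same point), so another starting point may lead to a different preimage.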


This theorem is the strongest one that can be proved in general. While
the non-singularity of Df on A implies that f is locally one-to-one at each
point of A, it does not imply that f is one-to-one on all of A. Consider the
following example:

EXAMPLE 1. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation

\[ f(r,\theta) = (r\cos\theta,\ r\sin\theta). \]

Then

\[ Df(r,\theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}, \]

so that det $Df(r,\theta) = r$.

Let A be the open set $(0,1) \times (0,b)$ in the $(r,\theta)$ plane. Then Df is non-
singular at each point of A. However, f is one-to-one on A only if $b \le 2\pi$.
See Figures 8.3 and 8.4.
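
The failure of injectivity for $b > 2\pi$ can be exhibited with two explicit points. In the sketch below, the value b = 7 and the points p and q are our own choices for illustration.

```python
import math

def f(r, theta):
    # The polar coordinate transformation of Example 1.
    return (r * math.cos(theta), r * math.sin(theta))

def det_Df(r, theta):
    # det [[cos t, -r sin t], [sin t, r cos t]], which simplifies to r.
    return math.cos(theta) * (r * math.cos(theta)) + r * math.sin(theta) * math.sin(theta)

b = 7.0                               # any b > 2*pi would do
p = (0.5, 0.3)
q = (0.5, 0.3 + 2 * math.pi)          # q lies in (0,1) x (0,b) since 0.3 + 2*pi < 7

same_image = all(abs(s - t) < 1e-12 for s, t in zip(f(*p), f(*q)))
print(same_image, det_Df(*p), det_Df(*q))
```

Both points lie in A, Df is non-singular at each of them, and yet they have the same image under f.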




Figure 8.3

Figure 8.4


EXERCISES 

1. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation

\[ f(x,y) = (x^2 - y^2,\ 2xy). \]

(a) Show that f is one-to-one on the set A consisting of all (x,y) with
$x > 0$. [Hint: If $f(x,y) = f(a,b)$, then $\|f(x,y)\| = \|f(a,b)\|$.]

(b) What is the set $B = f(A)$?

(c) If g is the inverse function, find Dg(0,1).

2. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the equation

\[ f(x,y) = (e^x \cos y,\ e^x \sin y). \]

(a) Show that f is one-to-one on the set A consisting of all (x,y) with
$0 < y < 2\pi$. [Hint: See the hint in the preceding exercise.]

(b) What is the set $B = f(A)$?

(c) If g is the inverse function, find Dg(0,1).

3. Let $f : \mathbf{R}^n \to \mathbf{R}^n$ be given by the equation $f(x) = \|x\|^2 \cdot x$. Show that
f is of class $C^\infty$ and that f carries the unit ball $B(0;1)$ onto itself in
a one-to-one fashion. Show, however, that the inverse function is not
differentiable at 0.

4. Let $g : \mathbf{R}^2 \to \mathbf{R}^2$ be given by the equation

\[ g(x,y) = (2ye^{2x},\ xe^y). \]

Let $f : \mathbf{R}^2 \to \mathbf{R}^3$ be given by the equation

\[ f(x,y) = (3x - y^2,\ 2x + y,\ xy + y^3). \]

(a) Show that there is a neighborhood of (0,1) that g carries in a one-
to-one fashion onto a neighborhood of (2,0).

(b) Find $D(f \circ g^{-1})$ at (2,0).

5. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}^n$ be of class $C^r$; assume Df(x) is
non-singular for $x \in A$. Show that even if f is not one-to-one on A, the
set $B = f(A)$ is open in $\mathbf{R}^n$.


*§9. THE IMPLICIT FUNCTION THEOREM 


The topic of implicit differentiation is one that is probably familiar to you 
from calculus. Here is a typical problem: 

“Assume that the equation $x^3 y + 2e^{xy} = 0$ determines y as
a differentiable function of x. Find dy/dx.”

One solves this calculus problem by “looking at y as a function of x,” and
differentiating with respect to x. One obtains the equation

\[ 3x^2 y + x^3\,dy/dx + 2e^{xy}(y + x\,dy/dx) = 0, \]

which one solves for dy/dx. The derivative dy/dx is of course expressed in
terms of x and the unknown function y.

The case of an arbitrary function f is handled similarly. Supposing that
the equation $f(x,y) = 0$ determines y as a differentiable function of x, say
$y = g(x)$, the equation $f(x,g(x)) = 0$ is an identity. One applies the chain
rule to calculate

\[ \partial f/\partial x + (\partial f/\partial y)\,g'(x) = 0, \]

so that

\[ g'(x) = -\frac{\partial f/\partial x}{\partial f/\partial y}, \]

where the partial derivatives are evaluated at the point (x, g(x)). Note that
the solution involves a hypothesis not given in the statement of the problem.
In order to find $g'(x)$, it is necessary to assume that $\partial f/\partial y$ is non-zero at the
point in question.
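
For the calculus problem quoted above, the formula $g'(x) = -(\partial f/\partial x)/(\partial f/\partial y)$ can be checked numerically. The sketch below finds a point on the curve $x^3 y + 2e^{xy} = 0$ with x = 1 and compares the formula with a difference quotient of the implicitly defined function. The choice x = 1, the starting guess, the use of Newton's method to solve for y, and all function names are assumptions made for illustration.

```python
import math

def f(x, y):
    return x ** 3 * y + 2 * math.exp(x * y)

def df_dx(x, y):
    return 3 * x ** 2 * y + 2 * y * math.exp(x * y)

def df_dy(x, y):
    return x ** 3 + 2 * x * math.exp(x * y)

def solve_y(x, y_guess, iterations=50):
    """Newton's method in y for the equation f(x, y) = 0, with x held fixed."""
    y = y_guess
    for _ in range(iterations):
        y -= f(x, y) / df_dy(x, y)
    return y

x0 = 1.0
y0 = solve_y(x0, -1.0)                         # (x0, y0) lies on the curve

# Slope from implicit differentiation; requires df/dy to be non-zero at (x0, y0).
slope_formula = -df_dx(x0, y0) / df_dy(x0, y0)

# Slope from a central difference quotient of the implicitly defined function.
step = 1e-5
slope_numeric = (solve_y(x0 + step, y0) - solve_y(x0 - step, y0)) / (2 * step)

print(y0, slope_formula, slope_numeric)
```

The two slopes agree to within the accuracy of the difference quotient. Observe that the computation divides by $\partial f/\partial y$, which is exactly the hypothesis singled out in the preceding paragraph.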

It in fact turns out that the non-vanishing of $\partial f/\partial y$ is also sufficient
to justify the assumptions we made in solving the problem. That is, if the
function f(x,y) has the property that $\partial f/\partial y \neq 0$ at a point (a,b) that is a
solution of the equation $f(x,y) = 0$, then this equation does determine y as
a function of x, for x near a, and this function of x is differentiable.

This result is a special case of a theorem called the implicit function
theorem, which we prove in this section.

The general case of the implicit function theorem involves a system of
equations rather than a single equation. One seeks to solve this system for
some of the unknowns in terms of the others. Specifically, suppose that $f :
\mathbf{R}^{k+n} \to \mathbf{R}^n$ is a function of class $C^1$. Then the vector equation

\[ f(x_1, \ldots, x_{k+n}) = 0 \]

is equivalent to a system of n scalar equations in k + n unknowns. One would
expect to be able to assign arbitrary values to k of the unknowns and to solve
for the remaining unknowns in terms of these. One would also expect that
the resulting functions would be differentiable, and that one could by implicit
differentiation find their derivatives.

There are two separate problems here. The first is the problem of finding 
the derivatives of these implicitly defined functions, assuming they exist; the 
solution to this problem generalizes the computation of g'(x) just given. The 
second involves showing that (under suitable conditions) the implicitly defined 
functions exist and are differentiable. 

In order to state our results in a convenient form, we introduce a new
notation for the matrix Df and its submatrices:

Definition. Let A be open in $\mathbf{R}^m$; let $f : A \to \mathbf{R}^n$ be differentiable. Let
$f_1, \ldots, f_n$ be the component functions of f. We sometimes use the notation

\[ Df = \frac{\partial(f_1, \ldots, f_n)}{\partial(x_1, \ldots, x_m)} \]

for the derivative of f. On occasion we shorten this to the notation

\[ Df = \partial f/\partial x. \]

More generally, we shall use the notation

\[ \frac{\partial(f_{i_1}, \ldots, f_{i_k})}{\partial(x_{j_1}, \ldots, x_{j_\ell})} \]

to denote the k by $\ell$ matrix that consists of the entries of Df lying in rows
$i_1, \ldots, i_k$ and columns $j_1, \ldots, j_\ell$. The general entry of this matrix, in row p
and column q, is the partial derivative $\partial f_{i_p}/\partial x_{j_q}$.

Now we deal with the problem of finding the derivative of an implicitly
defined function, assuming it exists and is differentiable. For simplicity, we
shall assume that we have solved a system of n equations in k + n unknowns
for the last n unknowns in terms of the first k unknowns.

Theorem 9.1. Let A be open in $\mathbf{R}^{k+n}$; let $f : A \to \mathbf{R}^n$ be differen-
tiable. Write f in the form f(x,y), for $x \in \mathbf{R}^k$ and $y \in \mathbf{R}^n$; then Df has
the form

\[ Df = \bigl[\ \partial f/\partial x \quad \partial f/\partial y\ \bigr]. \]

Suppose there is a differentiable function $g : B \to \mathbf{R}^n$ defined on an open
set B in $\mathbf{R}^k$, such that

\[ f(x, g(x)) = 0 \]

for all $x \in B$. Then for $x \in B$,

\[ \frac{\partial f}{\partial x}(x, g(x)) + \frac{\partial f}{\partial y}(x, g(x)) \cdot Dg(x) = 0. \]

This equation implies that if the n by n matrix $\partial f/\partial y$ is non-singular at
the point (x, g(x)), then

\[ Dg(x) = -\left[\frac{\partial f}{\partial y}(x, g(x))\right]^{-1} \cdot \frac{\partial f}{\partial x}(x, g(x)). \]

Note that in the case n = k = 1, this is the same formula for the derivative
that was derived earlier; the matrices involved are 1 by 1 matrices in that
case.


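Proof. Given g, let us define $h : B \to \mathbf{R}^{k+n}$ by the equation

\[ h(x) = (x, g(x)). \]

The hypotheses of the theorem imply that the composite function

\[ H(x) = f(h(x)) = f(x, g(x)) \]

is defined and equals zero for all $x \in B$. The chain rule then implies that

\[ \begin{aligned}
   0 = DH(x) &= Df(h(x)) \cdot Dh(x) \\
     &= \left[\ \frac{\partial f}{\partial x}(h(x)) \quad \frac{\partial f}{\partial y}(h(x))\ \right]
        \cdot \begin{bmatrix} I_k \\ Dg(x) \end{bmatrix} \\
     &= \frac{\partial f}{\partial x}(h(x)) + \frac{\partial f}{\partial y}(h(x)) \cdot Dg(x),
   \end{aligned} \]

as desired. □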
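
As a worked instance of Theorem 9.1 with k = 2 and n = 1, consider the unit sphere. The function f, the set B, and the function g below are our own example, chosen for illustration; they do not appear in the text.

```latex
% Assumed example: f(x_1, x_2, y) = x_1^2 + x_2^2 + y^2 - 1 on R^{2+1},
% B = open unit disc in R^2, and g(x) = \sqrt{1 - x_1^2 - x_2^2},
% so that f(x, g(x)) = 0 for all x in B.
\[
  \frac{\partial f}{\partial x} = \bigl[\, 2x_1 \quad 2x_2 \,\bigr],
  \qquad
  \frac{\partial f}{\partial y} = \bigl[\, 2y \,\bigr].
\]
% On B we have y = g(x) > 0, so the 1 by 1 matrix \partial f/\partial y is non-singular,
% and the formula of Theorem 9.1 gives
\[
  Dg(x) = -\bigl[\, 2g(x) \,\bigr]^{-1} \cdot \bigl[\, 2x_1 \quad 2x_2 \,\bigr]
        = \left[\, -\frac{x_1}{g(x)} \quad -\frac{x_2}{g(x)} \,\right].
\]
% Differentiating g directly gives the same result:
\[
  D_j g(x) = \frac{-x_j}{\sqrt{1 - x_1^2 - x_2^2}} = -\frac{x_j}{g(x)}, \qquad j = 1, 2.
\]
```

The formula breaks down on the boundary circle of B, where g(x) = 0 and $\partial f/\partial y$ becomes singular; this is consistent with the role that non-singularity of $\partial f/\partial y$ plays in the theorem.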


The preceding theorem tells us that in order to compute Dg, we must
assume that the matrix $\partial f/\partial y$ is non-singular. Now we prove that the non-
singularity of $\partial f/\partial y$ suffices to guarantee that the function g exists and is
differentiable.

Theorem 9.2 (Implicit function theorem). Let A be open in
$\mathbf{R}^{k+n}$; let $f : A \to \mathbf{R}^n$ be of class $C^r$. Write f in the form f(x,y),
for $x \in \mathbf{R}^k$ and $y \in \mathbf{R}^n$. Suppose that (a,b) is a point of A such that
$f(a,b) = 0$ and

\[ \det \frac{\partial f}{\partial y}(a,b) \neq 0. \]

Then there is a neighborhood B of a in $\mathbf{R}^k$ and a unique continuous
function $g : B \to \mathbf{R}^n$ such that $g(a) = b$ and

\[ f(x, g(x)) = 0 \]

for all $x \in B$. The function g is in fact of class $C^r$.


Proof. We construct a function F to which we can apply the inverse
function theorem. Define $F : A \to \mathbf{R}^{k+n}$ by the equation

\[ F(x,y) = (x, f(x,y)). \]

Then F maps the open set A of $\mathbf{R}^{k+n}$ into $\mathbf{R}^k \times \mathbf{R}^n = \mathbf{R}^{k+n}$. Furthermore,

\[ DF = \begin{bmatrix} I_k & 0 \\ \partial f/\partial x & \partial f/\partial y \end{bmatrix}. \]

Computing det DF by repeated application of Lemma 2.12, we have
det DF = det $\partial f/\partial y$. Thus DF is non-singular at the point (a,b).

Now $F(a,b) = (a,0)$. Applying the inverse function theorem to the map
F, we conclude that there exists an open set $U \times V$ of $\mathbf{R}^{k+n}$ about (a,b)
(where U is open in $\mathbf{R}^k$ and V is open in $\mathbf{R}^n$) such that:

(1) F maps $U \times V$ in a one-to-one fashion onto an open set W in $\mathbf{R}^{k+n}$
containing (a,0).

(2) The inverse function $G : W \to U \times V$ is of class $C^r$.

Note that because $F(x,y) = (x, f(x,y))$, we have

\[ (x,y) = G(x, f(x,y)). \]

Thus G preserves the first k coordinates, as F does. Then we can write G in
the form

\[ G(x,z) = (x, h(x,z)) \]

for $x \in \mathbf{R}^k$ and $z \in \mathbf{R}^n$; here h is a function of class $C^r$ mapping W into $\mathbf{R}^n$.

Let B be a connected neighborhood of a in R fc , chosen small enough that 
B x 0 is contained in W . See Figure 9.1. We prove existence of the function 
g ; B — ► R". If x G B, then (x,0) E W t so we have: 

G^O) = (x,/i(x, 0)), 

(x, 0) = F(x,h(x, 0)) = (x, / (x, h(x, 0))), 

0 = f(x,h(x,0)). 

We set $g(x) = h(x, 0)$ for $x \in B$; then $g$ satisfies the equation $f(x, g(x)) = 0$,
as desired. Furthermore,

$$(a, b) = G(a, 0) = (a, h(a, 0));$$

then $b = g(a)$, as desired.

Now we prove uniqueness of $g$. Let $g_0 : B \to \mathbf{R}^n$ be a continuous function
satisfying the conditions in the conclusion of our theorem. Then in particular,
$g_0$ agrees with $g$ at the point $a$. We show that if $g_0$ agrees with $g$ at the point
$a_0 \in B$, then $g_0$ agrees with $g$ in a neighborhood $B_0$ of $a_0$. This is easy.
The map $g$ carries $a_0$ into $V$. Since $g_0$ is continuous, there is a neighborhood
$B_0$ of $a_0$ contained in $B$ such that $g_0$ also maps $B_0$ into $V$. The fact that
$f(x, g_0(x)) = 0$ for $x \in B_0$ implies that

$$F(x, g_0(x)) = (x, 0), \quad\text{so}$$
$$(x, g_0(x)) = G(x, 0) = (x, h(x, 0)).$$

Thus $g_0$ and $g$ agree on $B_0$. It follows that $g_0$ and $g$ agree on all of $B$: The set
of points of $B$ for which $|g(x) - g_0(x)| = 0$ is open in $B$ (as we just proved),
and so is the set of points of $B$ for which $|g(x) - g_0(x)| > 0$ (by continuity
of $g$ and $g_0$). Since $B$ is connected, the latter set must be empty. □


In our proof of the implicit function theorem, there was of course nothing 
special about solving for the last n coordinates; that choice was made simply 
for convenience. The same argument applies to the problem of solving for any 
n coordinates in terms of the others. 

For example, suppose $A$ is open in $\mathbf{R}^5$ and $f : A \to \mathbf{R}^2$ is a function
of class $C^r$. Suppose one wishes to “solve” the equation $f(x, y, z, u, v) = 0$
for the two unknowns $y$ and $u$ in terms of the other three. In this case, the
implicit function theorem tells us that if $a$ is a point of $A$ such that $f(a) = 0$
and

$$\det \frac{\partial f}{\partial (y, u)}(a) \neq 0,$$

then one can solve for $y$ and $u$ locally near that point, say $y = \phi(x, z, v)$ and
$u = \psi(x, z, v)$. Furthermore, the derivatives of $\phi$ and $\psi$ satisfy the formula

$$\frac{\partial(\phi, \psi)}{\partial(x, z, v)} = -\left[\frac{\partial f}{\partial(y, u)}\right]^{-1} \cdot \left[\frac{\partial f}{\partial(x, z, v)}\right].$$


EXAMPLE 1. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be given by the equation

$$f(x, y) = x^2 + y^2 - 5.$$

Then the point $(x, y) = (1, 2)$ satisfies the equation $f(x, y) = 0$. Both $\partial f/\partial x$
and $\partial f/\partial y$ are non-zero at $(1, 2)$, so we can solve this equation locally for
either variable in terms of the other. In particular, we can solve for $y$ in terms
of $x$, obtaining the function

$$y = g(x) = [5 - x^2]^{1/2}.$$

Note that this solution is not unique in a neighborhood of $x = 1$ unless we
specify that $g$ is continuous. For instance, the function

$$h(x) = \begin{cases} [5 - x^2]^{1/2} & \text{for } x \geq 1, \\ -[5 - x^2]^{1/2} & \text{for } x < 1 \end{cases}$$

satisfies the same conditions, but is not continuous. See Figure 9.2.




Figure 9.2 
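
As a numerical illustration of Example 1 (a sketch added here, not part of the original text), the following Python fragment compares the slope predicted by Theorem 9.1, which for $k = n = 1$ reads $g'(x) = -(\partial f/\partial x)/(\partial f/\partial y)$, with a central-difference estimate of $g'(1)$. The helper names and the step size are illustrative assumptions.

```python
import math


def f(x, y):
    """The function of Example 1."""
    return x**2 + y**2 - 5


def g(x):
    """The continuous solution of f(x, y) = 0 passing through (1, 2)."""
    return math.sqrt(5 - x**2)


def implicit_derivative(x, y):
    """Theorem 9.1 with k = n = 1: Dg = -(df/dy)^(-1) * df/dx."""
    df_dx = 2 * x
    df_dy = 2 * y
    return -df_dx / df_dy


def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient; the step size is an arbitrary choice."""
    return (func(x + step) - func(x - step)) / (2 * step)


slope_formula = implicit_derivative(1.0, g(1.0))
slope_numeric = central_difference(g, 1.0)
print(slope_formula, slope_numeric)
```

Both printed values should be close to $-1/2$, the slope of the circle $x^2 + y^2 = 5$ at the point $(1, 2)$.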

EXAMPLE 2. Let $f$ be the function of Example 1. The point $(x, y) = (\sqrt{5}, 0)$
also satisfies the equation $f(x, y) = 0$. The derivative $\partial f/\partial y$ vanishes at
$(\sqrt{5}, 0)$, so we do not expect to be able to solve for $y$ in terms of $x$ near this
point. And, in fact, there is no neighborhood $B$ of $\sqrt{5}$ on which we can solve
for $y$ in terms of $x$. See Figure 9.3.



Figure 9.3 

EXAMPLE 3. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be given by the equation

$$f(x, y) = x^2 - y^3.$$

Then $(0, 0)$ is a solution of the equation $f(x, y) = 0$. Because $\partial f/\partial y$ vanishes
at $(0, 0)$, we do not expect to be able to solve this equation for $y$ in terms of
$x$ near $(0, 0)$. But in fact, we can; and furthermore, the solution is unique!
However, the function we obtain is not differentiable at $x = 0$. See Figure 9.4.



EXAMPLE 4. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be given by the equation

$$f(x, y) = y^2 - x^4.$$

Then $(0, 0)$ is a solution of the equation $f(x, y) = 0$. Because $\partial f/\partial y$ vanishes
at $(0, 0)$, we do not expect to be able to solve for $y$ in terms of $x$ near $(0, 0)$. In
fact, however, we can do so, and we can do so in such a way that the resulting
function is differentiable. However, the solution is not unique.



Now the point $(1, 1)$ also satisfies the equation $f(x, y) = 0$. Because
$\partial f/\partial y$ is non-zero at $(1, 1)$, one can solve this equation for $y$ as a continuous
function of $x$ in a neighborhood of $x = 1$. See Figure 9.5. One can in fact
express $y$ as a continuous function of $x$ on a larger neighborhood than the one
pictured, but if the neighborhood is large enough that it contains 0, then the
solution is not unique on that larger neighborhood.


EXERCISES 


1. Let $f : \mathbf{R}^3 \to \mathbf{R}^2$ be of class $C^1$; write $f$ in the form $f(x, y_1, y_2)$. Assume
that $f(3, -1, 2) = 0$ and

$$Df(3, -1, 2) = \begin{bmatrix} 1 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix}.$$

(a) Show there is a function $g : B \to \mathbf{R}^2$ of class $C^1$ defined on an open
set $B$ in $\mathbf{R}$ such that

$$f(x, g_1(x), g_2(x)) = 0$$

for $x \in B$, and $g(3) = (-1, 2)$.

(b) Find $Dg(3)$.

(c) Discuss the problem of solving the equation $f(x, y_1, y_2) = 0$ for an
arbitrary pair of the unknowns in terms of the third, near the point
$(3, -1, 2)$.

2. Given $f : \mathbf{R}^5 \to \mathbf{R}^2$, of class $C^1$. Let $a = (1, 2, -1, 3, 0)$; suppose that
$f(a) = 0$ and

$$Df(a) = \begin{bmatrix} 1 & 3 & 1 & -1 & 2 \\ 0 & 0 & 1 & 2 & -4 \end{bmatrix}.$$

(a) Show there is a function $g : B \to \mathbf{R}^2$ of class $C^1$ defined on an open
set $B$ of $\mathbf{R}^3$ such that

$$f(x_1, g_1(x), g_2(x), x_2, x_3) = 0$$

for $x = (x_1, x_2, x_3) \in B$, and $g(1, 3, 0) = (2, -1)$.

(b) Find $Dg(1, 3, 0)$.

(c) Discuss the problem of solving the equation $f(x) = 0$ for an arbitrary
pair of the unknowns in terms of the others, near the point $a$.

3. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be of class $C^1$, with $f(2, -1) = -1$. Set

$$G(x, y, u) = f(x, y) + u^2,$$
$$H(x, y, u) = ux + 3y^3 + u^3.$$

The equations $G(x, y, u) = 0$ and $H(x, y, u) = 0$ have the solution
$(x, y, u) = (2, -1, 1)$.

(a) What conditions on $Df$ ensure that there are $C^1$ functions $x = g(y)$
and $u = h(y)$ defined on an open set in $\mathbf{R}$ that satisfy both equations,
such that $g(-1) = 2$ and $h(-1) = 1$?

(b) Under the conditions of (a), and assuming that $Df(2, -1) = [1 \;\; -3]$,
find $g'(-1)$ and $h'(-1)$.

4. Let $F : \mathbf{R}^2 \to \mathbf{R}$ be of class $C^2$, with $F(0, 0) = 0$ and $DF(0, 0) = [2 \;\; 3]$.
Let $G : \mathbf{R}^3 \to \mathbf{R}$ be defined by the equation

$$G(x, y, z) = F(x + 2y + 3z - 1,\; x^3 + y^2 - z^2).$$

(a) Note that $G(-2, 3, -1) = F(0, 0) = 0$. Show that one can solve
the equation $G(x, y, z) = 0$ for $z$, say $z = g(x, y)$, for $(x, y)$ in a
neighborhood $B$ of $(-2, 3)$, such that $g(-2, 3) = -1$.

(b) Find $Dg(-2, 3)$.

*(c) If $D_1 D_1 F = 3$ and $D_1 D_2 F = -1$ and $D_2 D_2 F = 5$ at $(0, 0)$, find
$D_2 D_1 g(-2, 3)$.

5. Let $f, g : \mathbf{R}^3 \to \mathbf{R}$ be functions of class $C^1$. “In general,” one expects
that each of the equations $f(x, y, z) = 0$ and $g(x, y, z) = 0$ represents a
smooth surface in $\mathbf{R}^3$, and that their intersection is a smooth curve. Show
that if $(x_0, y_0, z_0)$ satisfies both equations, and if $\partial(f, g)/\partial(x, y, z)$ has
rank 2 at $(x_0, y_0, z_0)$, then near $(x_0, y_0, z_0)$, one can solve these equations
for two of $x, y, z$ in terms of the third, thus representing the solution set
locally as a parametrized curve.

6. Let $f : \mathbf{R}^{k+n} \to \mathbf{R}^n$ be of class $C^1$; suppose that $f(a) = 0$ and that $Df(a)$
has rank $n$. Show that if $c$ is a point of $\mathbf{R}^n$ sufficiently close to 0, then
the equation $f(x) = c$ has a solution.




Integration 


In this chapter, we define the integral of a real-valued function of several real
variables, and derive its properties. The integral we study is called the Riemann
integral; it is a direct generalization of the integral usually studied in a first
course in single-variable analysis.


§10. THE INTEGRAL OVER A RECTANGLE 


We begin by defining the volume of a rectangle. Let

$$Q = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$$

be a rectangle in $\mathbf{R}^n$. Each of the intervals $[a_i, b_i]$ is called a component
interval of $Q$. The maximum of the numbers $b_1 - a_1, \ldots, b_n - a_n$ is called
the width of $Q$. Their product

$$v(Q) = (b_1 - a_1)(b_2 - a_2) \cdots (b_n - a_n)$$

is called the volume of $Q$.

In the case $n = 1$, the volume and the width of the (1-dimensional)
rectangle $[a, b]$ are the same, namely, the number $b - a$. This number is also
called the length of $[a, b]$.


Definition. Given a closed interval $[a, b]$ of $\mathbf{R}$, a partition of $[a, b]$ is
a finite collection $P$ of points of $[a, b]$ that includes the points $a$ and $b$. We
usually index the elements of $P$ in increasing order, for notational convenience,
as

$$a = t_0 < t_1 < \cdots < t_k = b;$$

each of the intervals $[t_{i-1}, t_i]$, for $i = 1, \ldots, k$, is called a subinterval
determined by $P$, of the interval $[a, b]$. More generally, given a rectangle

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n]$$

in $\mathbf{R}^n$, a partition $P$ of $Q$ is an $n$-tuple $(P_1, \ldots, P_n)$ such that $P_j$ is a
partition of $[a_j, b_j]$ for each $j$. If for each $j$, $I_j$ is one of the subintervals
determined by $P_j$ of the interval $[a_j, b_j]$, then the rectangle

$$R = I_1 \times \cdots \times I_n$$

is called a subrectangle determined by $P$, of the rectangle $Q$. The maximum
width of these subrectangles is called the mesh of $P$.

Definition. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $f : Q \to \mathbf{R}$; assume $f$ is
bounded. Let $P$ be a partition of $Q$. For each subrectangle $R$ determined
by $P$, let

$$m_R(f) = \inf\{f(x) \mid x \in R\},$$
$$M_R(f) = \sup\{f(x) \mid x \in R\}.$$

We define the lower sum and the upper sum, respectively, of $f$, determined
by $P$, by the equations

$$L(f, P) = \sum_R m_R(f) \cdot v(R),$$
$$U(f, P) = \sum_R M_R(f) \cdot v(R),$$

where the summations extend over all subrectangles $R$ determined by $P$.

Let $P = (P_1, \ldots, P_n)$ be a partition of the rectangle $Q$. If $P''$ is a
partition of $Q$ obtained from $P$ by adjoining additional points to some or all
of the partitions $P_1, \ldots, P_n$, then $P''$ is called a refinement of $P$. Given two
partitions $P$ and $P' = (P'_1, \ldots, P'_n)$ of $Q$, the partition

$$P'' = (P_1 \cup P'_1, \ldots, P_n \cup P'_n)$$

is a refinement of both $P$ and $P'$; it is called their common refinement.

Passing from $P$ to a refinement of $P$ of course affects lower sums and
upper sums; in fact, it tends to increase the lower sums and decrease the
upper sums. That is the substance of the following lemma:

Lemma 10.1. Let $P$ be a partition of the rectangle $Q$; let $f : Q \to \mathbf{R}$
be a bounded function. If $P''$ is a refinement of $P$, then

$$L(f, P) \leq L(f, P'') \quad\text{and}\quad U(f, P'') \leq U(f, P).$$


Proof. Let $Q$ be the rectangle

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n].$$

It suffices to prove the lemma when $P''$ is obtained by adjoining a single
additional point to the partition of one of the component intervals of $Q$.
Suppose, to be definite, that $P$ is the partition $(P_1, \ldots, P_n)$ and that $P''$ is
obtained by adjoining the point $q$ to the partition $P_1$. Further, suppose that
$P_1$ consists of the points

$$a_1 = t_0 < t_1 < \cdots < t_k = b_1$$

and that $q$ lies interior to the subinterval $[t_{i-1}, t_i]$.

We first compare the lower sums $L(f, P)$ and $L(f, P'')$. Most of the
subrectangles determined by $P$ are also subrectangles determined by $P''$. An
exception occurs for a subrectangle determined by $P$ of the form

$$R_S = [t_{i-1}, t_i] \times S$$

(where $S$ is one of the subrectangles of $[a_2, b_2] \times \cdots \times [a_n, b_n]$ determined by
$(P_2, \ldots, P_n)$). The term involving the subrectangle $R_S$ disappears from the
lower sum and is replaced by the terms involving the two subrectangles

$$R'_S = [t_{i-1}, q] \times S \quad\text{and}\quad R''_S = [q, t_i] \times S,$$

which are determined by $P''$. See Figure 10.1.



Figure 10.1 

Now since $m_{R_S}(f) \leq f(x)$ for each $x \in R'_S$ and for each $x \in R''_S$, it
follows that

$$m_{R_S}(f) \leq m_{R'_S}(f) \quad\text{and}\quad m_{R_S}(f) \leq m_{R''_S}(f).$$

Because $v(R_S) = v(R'_S) + v(R''_S)$ by direct computation, we have

$$m_{R_S}(f)\, v(R_S) \leq m_{R'_S}(f)\, v(R'_S) + m_{R''_S}(f)\, v(R''_S).$$

Since this inequality holds for each subrectangle of the form $R_S$, it follows
that

$$L(f, P) \leq L(f, P''),$$

as desired.

A similar argument applies to show that $U(f, P) \geq U(f, P'')$. □

Now we explore the relation between upper sums and lower sums. We 
have the following result: 

Lemma 10.2. Let $Q$ be a rectangle; let $f : Q \to \mathbf{R}$ be a bounded
function. If $P$ and $P'$ are any two partitions of $Q$, then

$$L(f, P) \leq U(f, P').$$

Proof. In the case where $P = P'$, the result is obvious: For any subrectangle
$R$ determined by $P$, we have $m_R(f) \leq M_R(f)$. Multiplying by
$v(R)$ and summing gives the desired inequality.

In general, given partitions $P$ and $P'$ of $Q$, let $P''$ be their common
refinement. Using the preceding lemma, we conclude that

$$L(f, P) \leq L(f, P'') \leq U(f, P'') \leq U(f, P').$$ □
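
The two lemmas above can be checked on a small example. The following Python sketch is an illustration added here, not part of the original text. It computes $L(f, P)$ and $U(f, P)$ exactly for $f(x) = x^2$ on $[0, 1]$, relying on the assumption that $f$ is increasing, so that the infimum and supremum over each subinterval are attained at its left and right end points. The helper names `uniform_partition` and `lower_upper_sums` are hypothetical.

```python
from fractions import Fraction


def uniform_partition(a, b, pieces):
    """Partition of [a, b] into `pieces` subintervals of equal length."""
    a, b = Fraction(a), Fraction(b)
    return [a + (b - a) * i / pieces for i in range(pieces + 1)]


def lower_upper_sums(f, partition):
    """Return (L(f, P), U(f, P)) for an INCREASING function f.

    For increasing f, m_R(f) is the value at the left end point of each
    subinterval R and M_R(f) is the value at the right end point.
    """
    lower = Fraction(0)
    upper = Fraction(0)
    for left, right in zip(partition, partition[1:]):
        length = right - left
        lower += f(left) * length
        upper += f(right) * length
    return lower, upper


def square(x):
    return x * x


coarse = uniform_partition(0, 1, 4)   # P
fine = uniform_partition(0, 1, 8)     # P'', a refinement of P
print(lower_upper_sums(square, coarse))
print(lower_upper_sums(square, fine))
```

Here the lower sum rises from $7/32$ to $35/128$ and the upper sum falls from $15/32$ to $51/128$ on passing to the refinement, as Lemma 10.1 predicts, and every lower sum stays below every upper sum, as Lemma 10.2 predicts.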

Now (finally) we define the integral. 

Definition. Let $Q$ be a rectangle; let $f : Q \to \mathbf{R}$ be a bounded function.
As $P$ ranges over all partitions of $Q$, define

$$\underline{\int_Q} f = \sup_P \{L(f, P)\} \quad\text{and}\quad \overline{\int_Q} f = \inf_P \{U(f, P)\}.$$

These numbers are called the lower integral and upper integral, respectively,
of $f$ over $Q$. They exist because the numbers $L(f, P)$ are bounded
above by $U(f, P')$ where $P'$ is any fixed partition of $Q$; and the numbers
$U(f, P)$ are bounded below by $L(f, P')$. If the upper and lower integrals
of $f$ over $Q$ are equal, we say $f$ is integrable over $Q$, and we define the integral
of $f$ over $Q$ to equal the common value of the upper and lower integrals.
We denote the integral of $f$ over $Q$ by either of the symbols

$$\int_Q f \quad\text{or}\quad \int_{x \in Q} f(x).$$

EXAMPLE 1. Let $f : [a, b] \to \mathbf{R}$ be a non-negative bounded function. If $P$
is a partition of $I = [a, b]$, then $L(f, P)$ equals the total area of a bunch of
rectangles inscribed in the region between the graph of $f$ and the $x$-axis, and
$U(f, P)$ equals the total area of a bunch of rectangles circumscribed about
this region. See Figure 10.2.



The lower integral represents the so-called “inner area” of this region,
computed by approximating the region by inscribed rectangles, while the upper
integral represents the so-called “outer area,” computed by approximating
the region by circumscribed rectangles. If the “inner” and “outer” areas are
equal, then $f$ is integrable.

Similarly, if $Q$ is a rectangle in $\mathbf{R}^2$ and $f : Q \to \mathbf{R}$ is non-negative and
bounded, one can picture $L(f, P)$ as the total volume of a bunch of boxes
inscribed in the region between the graph of $f$ and the $xy$-plane, and $U(f, P)$
as the total volume of a bunch of boxes circumscribed about this region. See
Figure 10.3.



Figure 10.3 


EXAMPLE 2. Let $I = [0, 1]$. Let $f : I \to \mathbf{R}$ be defined by setting $f(x) = 0$ if
$x$ is rational, and $f(x) = 1$ if $x$ is irrational. We show that $f$ is not integrable
over $I$.

Let $P$ be a partition of $I$. If $R$ is any subinterval determined by $P$, then
$m_R(f) = 0$ and $M_R(f) = 1$, since $R$ contains both rational and irrational
numbers. Then

$$L(f, P) = \sum_R 0 \cdot v(R) = 0,$$

and

$$U(f, P) = \sum_R 1 \cdot v(R) = 1.$$

Since $P$ is arbitrary, it follows that the lower integral of $f$ over $I$ equals 0,
and the upper integral equals 1. Thus $f$ is not integrable over $I$.

A condition that is often useful for showing that a given function is integrable
is the following:

Theorem 10.3 (The Riemann condition). Let $Q$ be a rectangle;
let $f : Q \to \mathbf{R}$ be a bounded function. Then

$$\underline{\int_Q} f \leq \overline{\int_Q} f;$$

equality holds if and only if given $\epsilon > 0$, there exists a corresponding
partition $P$ of $Q$ for which

$$U(f, P) - L(f, P) < \epsilon.$$

Proof. Let $P'$ be a fixed partition of $Q$. It follows from the fact that
$L(f, P) \leq U(f, P')$ for every partition $P$ of $Q$, that

$$\underline{\int_Q} f \leq U(f, P').$$

Now we use the fact that $P'$ is arbitrary to conclude that

$$\underline{\int_Q} f \leq \overline{\int_Q} f.$$

Suppose now that the upper and lower integrals are equal. Choose a
partition $P$ so that $L(f, P)$ is within $\epsilon/2$ of the integral $\int_Q f$, and a partition
$P'$ so that $U(f, P')$ is within $\epsilon/2$ of the integral $\int_Q f$. Let $P''$ be their common
refinement. Since

$$L(f, P) \leq L(f, P'') \leq \int_Q f \leq U(f, P'') \leq U(f, P'),$$

the lower and upper sums for $f$ determined by $P''$ are within $\epsilon$ of each other.
Conversely, suppose the upper and lower integrals are not equal. Let

$$\epsilon = \overline{\int_Q} f - \underline{\int_Q} f > 0.$$

Let $P$ be any partition of $Q$. Then

$$L(f, P) \leq \underline{\int_Q} f < \overline{\int_Q} f \leq U(f, P);$$

hence the upper and lower sums for $f$ determined by $P$ are at least $\epsilon$ apart.
Thus the Riemann condition does not hold. □
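
To see the Riemann condition at work on a rectangle in $\mathbf{R}^2$, the following Python sketch (an illustration added here, not part of the original text) searches for a uniform partition $P$ of $Q = [0,1] \times [0,1]$ with $U(f, P) - L(f, P) < \epsilon$ for the function $f(x, y) = x + y$. It assumes that $f$ is non-decreasing in each variable, so that $m_R(f)$ and $M_R(f)$ are the values of $f$ at the lower and upper corners of each subrectangle $R$. The names `subintervals`, `lower_upper_sums`, `uniform`, and `riemann_partition` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def subintervals(points):
    """Consecutive pairs (t_{i-1}, t_i) of a sorted list of partition points."""
    return list(zip(points, points[1:]))


def lower_upper_sums(f, partition):
    """Return (L(f, P), U(f, P)) for P = (P_1, ..., P_n).

    Assumes f is non-decreasing in each coordinate, so the infimum and
    supremum over a subrectangle are attained at its lower and upper corners.
    """
    lower = Fraction(0)
    upper = Fraction(0)
    # Each subrectangle R = I_1 x ... x I_n is a choice of one subinterval per axis.
    for box in product(*(subintervals(points) for points in partition)):
        volume = Fraction(1)
        for left, right in box:
            volume *= right - left
        low_corner = [left for left, _ in box]
        high_corner = [right for _, right in box]
        lower += f(*low_corner) * volume
        upper += f(*high_corner) * volume
    return lower, upper


def f(x, y):
    return x + y


def uniform(pieces):
    """Uniform partition of [0, 1] with `pieces` subintervals."""
    return [Fraction(i, pieces) for i in range(pieces + 1)]


def riemann_partition(epsilon):
    """Smallest uniform partition of [0,1]^2 with U - L < epsilon."""
    pieces = 1
    while True:
        lower, upper = lower_upper_sums(f, (uniform(pieces), uniform(pieces)))
        if upper - lower < epsilon:
            return pieces, lower, upper
        pieces += 1


print(riemann_partition(Fraction(1, 4)))
```

For this $f$ one finds $U(f, P) - L(f, P) = 2/N$ when each side is cut into $N$ equal pieces, so the search with $\epsilon = 1/4$ stops at $N = 9$, where the lower and upper sums are $8/9$ and $10/9$.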

Here is an easy application of this theorem. 

Theorem 10.4. Every constant function $f(x) = c$ is integrable.
Indeed, if $Q$ is a rectangle and if $P$ is a partition of $Q$, then

$$\int_Q c = c \cdot v(Q) = c \sum_R v(R),$$

where the summation extends over all subrectangles determined by $P$.

Proof. If $R$ is a subrectangle determined by $P$, then $m_R(f) = c = M_R(f)$.
It follows that

$$L(f, P) = c \sum_R v(R) = U(f, P),$$

so the Riemann condition holds trivially. Thus $\int_Q c$ exists; since it lies between
$L(f, P)$ and $U(f, P)$, it must equal $c \sum_R v(R)$.

This result holds for any partition $P$. In particular, if $P$ is the trivial
partition whose only subrectangle is $Q$ itself,

$$\int_Q c = c \cdot v(Q).$$ □


A corollary of this result, which we shall use in the next section, is the 
following: 

Corollary 10.5. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $\{Q_1, \ldots, Q_k\}$ be a
finite collection of rectangles that covers $Q$. Then

$$v(Q) \leq \sum_{i=1}^{k} v(Q_i).$$

Proof. Choose a rectangle $Q'$ containing all the rectangles $Q_1, \ldots, Q_k$.
Use the end points of the component intervals of the rectangles $Q, Q_1, \ldots, Q_k$
to define a partition $P$ of $Q'$. Then each of the rectangles $Q, Q_1, \ldots, Q_k$ is a
union of subrectangles determined by $P$. See Figure 10.4.



Figure 10.4

From the preceding theorem, we conclude that

$$v(Q) = \sum_{R \subset Q} v(R),$$

where the summation extends over all subrectangles contained in $Q$. Because
each such subrectangle $R$ is contained in at least one of the rectangles
$Q_1, \ldots, Q_k$, we have

$$\sum_{R \subset Q} v(R) \leq \sum_{i=1}^{k} \sum_{R \subset Q_i} v(R).$$

Again using Theorem 10.4, we have

$$\sum_{R \subset Q_i} v(R) = v(Q_i);$$

the corollary follows. □

A remark about notation. We shall often use a slightly different notation
for the integral in the case $n = 1$. In this case, $Q$ is a closed interval $[a, b]$ in
$\mathbf{R}$, and we often denote the integral of $f$ over $[a, b]$ by one of the symbols

$$\int_a^b f \quad\text{or}\quad \int_{x=a}^{x=b} f(x)$$

instead of the symbol $\int_{[a,b]} f$.

Yet another notation is used in calculus for the one-dimensional integral.
There it is common to denote this integral by the expression

$$\int_a^b f(x)\, dx,$$

where the symbol “dx” has no independent meaning. We shall avoid this
notation for the time being. In a later chapter, we shall give “dx” a meaning
and shall introduce this notation.

The definition of the integral we have given is in fact due to Darboux. An 
equivalent formulation, due to Riemann, is given in Exercise 7. In practice, it 
has become standard to call this integral the Riemann integral, independent 
of which definition is used. 






EXERCISES 

1. Let $f, g : Q \to \mathbf{R}$ be bounded functions such that $f(x) \leq g(x)$ for $x \in Q$.
Show that $\underline{\int_Q} f \leq \underline{\int_Q} g$ and $\overline{\int_Q} f \leq \overline{\int_Q} g$.

2. Suppose $f : Q \to \mathbf{R}$ is continuous. Show $f$ is integrable over $Q$. [Hint:
Use uniform continuity of $f$.]

3. Let $[0,1]^2 = [0,1] \times [0,1]$. Let $f : [0,1]^2 \to \mathbf{R}$ be defined by setting
$f(x, y) = 0$ if $y \neq x$, and $f(x, y) = 1$ if $y = x$. Show that $f$ is integrable
over $[0,1]^2$.

4. We say $f : [0,1] \to \mathbf{R}$ is increasing if $f(x_1) \leq f(x_2)$ whenever $x_1 < x_2$.
If $f, g : [0,1] \to \mathbf{R}$ are increasing and non-negative, show that the function
$h(x, y) = f(x) g(y)$ is integrable over $[0,1]^2$.

5. Let $f : \mathbf{R} \to \mathbf{R}$ be defined by setting $f(x) = 1/q$ if $x = p/q$, where $p$ and
$q$ are positive integers with no common factor, and $f(x) = 0$ otherwise.
Show $f$ is integrable over $[0, 1]$.

*6. Prove the following:

Theorem. Let $f : Q \to \mathbf{R}$ be bounded. Then $f$ is integrable over $Q$ if
and only if given $\epsilon > 0$, there is a $\delta > 0$ such that $U(f, P) - L(f, P) < \epsilon$
for every partition $P$ of mesh less than $\delta$.

Proof. (a) Verify the “if” part of the theorem.

(b) Suppose $|f(x)| \leq M$ for $x \in Q$. Let $P$ be a partition of $Q$. Show
that if $P''$ is obtained by adjoining a single point to the partition of
one of the component intervals of $Q$, then

$$0 \leq L(f, P'') - L(f, P) \leq 2M (\text{mesh } P)(\text{width } Q)^{n-1}.$$

Derive a similar result for upper sums.

(c) Prove the “only if” part of the theorem: Suppose $f$ is integrable
over $Q$. Given $\epsilon > 0$, choose a partition $P'$ such that
$U(f, P') - L(f, P') < \epsilon/2$. Let $N$ be the number of partition points in $P'$; then
let

$$\delta = \frac{\epsilon}{8MN (\text{width } Q)^{n-1}}.$$

Show that if $P$ has mesh less than $\delta$, then $U(f, P) - L(f, P) < \epsilon$.
[Hint: The common refinement of $P$ and $P'$ is obtained by adjoining
at most $N$ points to $P$.]

7. Use Exercise 6 to prove the following:

Theorem. Let $f : Q \to \mathbf{R}$ be bounded. Then the statement that $f$
is integrable over $Q$, with $\int_Q f = A$, is equivalent to the statement
that given $\epsilon > 0$, there is a $\delta > 0$ such that if $P$ is any partition of
mesh less than $\delta$, and if, for each subrectangle $R$ determined by $P$,
$x_R$ is a point of $R$, then

$$\left| \sum_R f(x_R)\, v(R) - A \right| < \epsilon.$$



§11. EXISTENCE OF THE INTEGRAL


In this section, we derive a necessary and sufficient condition for the existence
of the integral $\int_Q f$. It involves the notion of a “set of measure zero.”

Definition. Let $A$ be a subset of $\mathbf{R}^n$. We say $A$ has measure zero in
$\mathbf{R}^n$ if for every $\epsilon > 0$, there is a covering $Q_1, Q_2, \ldots$ of $A$ by countably many
rectangles such that

$$\sum_{i=1}^{\infty} v(Q_i) < \epsilon.$$

If this inequality holds, we often say that the total volume of the rectangles
$Q_1, Q_2, \ldots$ is less than $\epsilon$.


We derive some properties of sets of measure zero. 

Theorem 11.1. (a) If $B \subset A$ and $A$ has measure zero in $\mathbf{R}^n$, then
so does $B$.

(b) Let $A$ be the union of the countable collection of sets $A_1, A_2, \ldots$.
If each $A_i$ has measure zero in $\mathbf{R}^n$, so does $A$.

(c) A set $A$ has measure zero in $\mathbf{R}^n$ if and only if for every $\epsilon > 0$,
there is a countable covering of $A$ by open rectangles $\text{Int } Q_1, \text{Int } Q_2, \ldots$
such that

$$\sum_{i=1}^{\infty} v(Q_i) < \epsilon.$$

(d) If $Q$ is a rectangle in $\mathbf{R}^n$, then $\text{Bd } Q$ has measure zero in $\mathbf{R}^n$
but $Q$ does not.


Proof. (a) is immediate. To prove (b), cover the set $A_j$ by countably
many rectangles

$$Q_{1j}, Q_{2j}, Q_{3j}, \ldots$$

of total volume less than $\epsilon/2^j$. Do this for each $j$. Then the collection of
rectangles $\{Q_{ij}\}$ is countable, it covers $A$, and it has total volume less than

$$\sum_{j=1}^{\infty} \epsilon/2^j = \epsilon.$$

(c) If the open rectangles $\text{Int } Q_1, \text{Int } Q_2, \ldots$ cover $A$, then so do the
rectangles $Q_1, Q_2, \ldots$. Thus the given condition implies that $A$ has measure
zero. Conversely, suppose $A$ has measure zero. Cover $A$ by rectangles
$Q'_1, Q'_2, \ldots$ of total volume less than $\epsilon/2$. For each $i$, choose a rectangle $Q_i$
such that

$$Q'_i \subset \text{Int } Q_i \quad\text{and}\quad v(Q_i) \leq 2 v(Q'_i).$$

(This we can do because $v(Q)$ is a continuous function of the end points of
the component intervals of $Q$.) Then the open rectangles $\text{Int } Q_1, \text{Int } Q_2, \ldots$
cover $A$, and $\sum v(Q_i) < \epsilon$.
(d) Let

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n].$$

The subset of $Q$ consisting of those points $x$ of $Q$ for which $x_i = a_i$ is called
one of the $i$th faces of $Q$. The other $i$th face consists of those $x$ for which
$x_i = b_i$. Each face of $Q$ has measure zero in $\mathbf{R}^n$; for instance, the face for
which $x_i = a_i$ can be covered by the single rectangle

$$[a_1, b_1] \times \cdots \times [a_i, a_i + \delta] \times \cdots \times [a_n, b_n],$$

whose volume may be made as small as desired by taking $\delta$ small. Now $\text{Bd } Q$
is the union of the faces of $Q$, which are finite in number. Therefore $\text{Bd } Q$
has measure zero in $\mathbf{R}^n$.

Now we suppose $Q$ has measure zero in $\mathbf{R}^n$, and derive a contradiction.
Set $\epsilon = v(Q)$. We can by (c) cover $Q$ by open rectangles $\text{Int } Q_1, \text{Int } Q_2, \ldots$
with $\sum v(Q_i) < \epsilon$. Because $Q$ is compact, we can cover $Q$ by finitely many
of these open sets, say $\text{Int } Q_1, \ldots, \text{Int } Q_k$. But

$$\sum_{i=1}^{k} v(Q_i) < \epsilon = v(Q),$$

a result that contradicts Corollary 10.5. □

EXAMPLE 1. Allowing for a countably infinite collection of rectangles is an
essential part of the definition of a set of measure zero. One would obtain
a different notion if one allowed only finite collections. For instance, the set
$A$ of rational numbers in $I = [0, 1]$ is a countable union of one-point sets, so
that $A$ has measure zero in $\mathbf{R}$ by (b) of the preceding theorem. But $A$ cannot
be covered by finitely many intervals of total length less than $\epsilon$ if $\epsilon < 1$. For
suppose $I_1, \ldots, I_k$ is a finite collection of intervals covering $A$. Then the
set $B$ which is their union is a finite union of closed sets and therefore closed.
Since $B$ contains all rationals in $I$, it contains all limit points of these rationals;
that is, it contains all of $I$. But this implies that the intervals $I_1, \ldots, I_k$ cover
$I$, whence by Corollary 10.5,

$$\sum_{i=1}^{k} v(I_i) \geq v(I) = 1.$$
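
The countable covering behind Example 1 can be made concrete. The following Python sketch (an illustration added here, not part of the original text) enumerates the rationals of $[0, 1]$ and covers the $i$th one by a closed interval of length $\epsilon/2^{i+1}$, in the spirit of the proof of Theorem 11.1(b). The particular enumeration, the interval lengths, and the function names are assumptions made for the example; any choice with total length less than $\epsilon$ would serve.

```python
from fractions import Fraction
from itertools import islice


def rationals_in_unit_interval():
    """Yield each rational number in [0, 1] exactly once, by increasing denominator."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # keep p/q only when it is in lowest terms
                yield r
        q += 1


def covering_intervals(epsilon, count):
    """Intervals covering the first `count` rationals.

    The i-th rational (i = 1, 2, ...) is the center of a closed interval
    of length epsilon / 2**(i + 1).
    """
    intervals = []
    rationals = islice(rationals_in_unit_interval(), count)
    for i, r in enumerate(rationals, start=1):
        half_length = epsilon / 2 ** (i + 2)
        intervals.append((r - half_length, r + half_length))
    return intervals


epsilon = Fraction(1, 10)
intervals = covering_intervals(epsilon, 50)
total = sum(right - left for left, right in intervals)
print(float(total), total < epsilon)
```

The total length of the first $N$ intervals is $(\epsilon/2)(1 - 2^{-N})$, which stays below $\epsilon/2$ however many rationals are covered; no finite subfamily, however, covers all of the rationals in $[0, 1]$.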


Now we prove our main theorem. 

Theorem 11.2. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $f : Q \to \mathbf{R}$ be a
bounded function. Let $D$ be the set of points of $Q$ at which $f$ fails to be
continuous. Then $\int_Q f$ exists if and only if $D$ has measure zero in $\mathbf{R}^n$.

Proof. Choose $M$ so that $|f(x)| \leq M$ for $x \in Q$.

Step 1. We prove the “if” part of the theorem. Assume $D$ has measure
zero in $\mathbf{R}^n$. We show that $f$ is integrable over $Q$ by showing that given $\epsilon > 0$,
there is a partition $P$ of $Q$ for which $U(f, P) - L(f, P) < \epsilon$.

Given $\epsilon$, let $\epsilon'$ be the strange number

$$\epsilon' = \epsilon/(2M + 2v(Q)).$$

First, we cover $D$ by countably many open rectangles $\text{Int } Q_1, \text{Int } Q_2, \ldots$ of
total volume less than $\epsilon'$, using (c) of the preceding theorem. Second, for each
point $a$ of $Q$ not in $D$, we choose an open rectangle $\text{Int } Q_a$ containing $a$ such
that

$$|f(x) - f(a)| < \epsilon' \quad\text{for } x \in Q_a \cap Q.$$

(This we can do because $f$ is continuous at $a$.) Then the open sets $\text{Int } Q_i$
and $\text{Int } Q_a$, for $i = 1, 2, \ldots$ and for $a \in Q - D$, cover all of $Q$. Since $Q$ is
compact, we can choose a finite subcollection

$$\text{Int } Q_1, \ldots, \text{Int } Q_k, \text{Int } Q_{a_1}, \ldots, \text{Int } Q_{a_\ell}$$

that covers $Q$. (The open rectangles $\text{Int } Q_1, \ldots, \text{Int } Q_k$ may not cover $D$,
but that does not matter.)

Denote $Q_{a_j}$ by $Q'_j$ for convenience. Then the rectangles

$$Q_1, \ldots, Q_k, Q'_1, \ldots, Q'_\ell$$

cover $Q$, where the rectangles $Q_i$ satisfy the condition

$$(1) \qquad \sum_{i=1}^{k} v(Q_i) < \epsilon',$$

and the rectangles $Q'_j$ satisfy the condition

$$(2) \qquad |f(x) - f(y)| \leq 2\epsilon' \quad\text{for } x, y \in Q'_j \cap Q.$$

Without change of notation, let us replace each rectangle $Q_i$ by its intersection
with $Q$, and each rectangle $Q'_j$ by its intersection with $Q$. The new
rectangles $\{Q_i\}$ and $\{Q'_j\}$ still cover $Q$ and satisfy conditions (1) and (2).

Now let us use the end points of the component intervals of the rectangles
$Q_1, \ldots, Q_k, Q'_1, \ldots, Q'_\ell$ to define a partition $P$ of $Q$. Then each of the
rectangles $Q_i$ and $Q'_j$ is a union of subrectangles determined by $P$. We
compute the upper and lower sums of $f$ relative to $P$.


Figure 11.1 (legend: $D$; $R \in \mathcal{R}$; $R \in \mathcal{R}'$)


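Divide the collection of all subrectangles $R$ determined by $P$ into two
disjoint subcollections $\mathcal{R}$ and $\mathcal{R}'$, so that each rectangle $R \in \mathcal{R}$ lies in one of
the rectangles $Q_i$, and each rectangle $R \in \mathcal{R}'$ lies in one of the rectangles $Q'_j$.
See Figure 11.1. We have

$$\sum_{R \in \mathcal{R}} (M_R(f) - m_R(f))\, v(R) \leq 2M \sum_{R \in \mathcal{R}} v(R), \quad\text{and}$$
$$\sum_{R \in \mathcal{R}'} (M_R(f) - m_R(f))\, v(R) \leq 2\epsilon' \sum_{R \in \mathcal{R}'} v(R);$$

these inequalities follow from the fact that

$$|f(x) - f(y)| \leq 2M$$

for any two points $x, y$ belonging to a rectangle $R \in \mathcal{R}$, and

$$|f(x) - f(y)| \leq 2\epsilon'$$

for any two points $x, y$ belonging to a rectangle $R \in \mathcal{R}'$. Now

$$\sum_{R \in \mathcal{R}} v(R) \leq \sum_{i=1}^{k} \sum_{R \subset Q_i} v(R) = \sum_{i=1}^{k} v(Q_i) < \epsilon', \quad\text{and}$$
$$\sum_{R \in \mathcal{R}'} v(R) \leq \sum_{R \subset Q} v(R) = v(Q).$$

Thus

$$U(f, P) - L(f, P) < 2M\epsilon' + 2\epsilon' v(Q) = \epsilon.$$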

Step 2. We now define what we mean by the “oscillation” of a function $f$
at a point $a$ of its domain, and relate it to continuity of $f$ at $a$.

Given $a \in Q$ and given $\delta > 0$, let $A_\delta$ denote the set of values of $f(x)$ at
points $x$ within $\delta$ of $a$. That is,

$$A_\delta = \{f(x) \mid x \in Q \text{ and } |x - a| < \delta\}.$$

Let $M_\delta(f) = \sup A_\delta$, and let $m_\delta(f) = \inf A_\delta$. We define the oscillation of
$f$ at $a$ by the equation

$$\nu(f; a) = \inf_{\delta > 0} [M_\delta(f) - m_\delta(f)].$$

Then $\nu(f; a)$ is non-negative; we show that $f$ is continuous at $a$ if and only
if $\nu(f; a) = 0$.

If $f$ is continuous at $a$, then, given $\epsilon > 0$, we can choose $\delta > 0$ so that
$|f(x) - f(a)| < \epsilon$ for all $x \in Q$ with $|x - a| < \delta$. It follows that

$$M_\delta(f) \leq f(a) + \epsilon \quad\text{and}\quad m_\delta(f) \geq f(a) - \epsilon.$$

Hence $\nu(f; a) \leq 2\epsilon$. Since $\epsilon$ is arbitrary, $\nu(f; a) = 0$.

Conversely, suppose $\nu(f; a) = 0$. Given $\epsilon > 0$, there is a $\delta > 0$ such that

$$M_\delta(f) - m_\delta(f) < \epsilon.$$

Now if $x \in Q$ and $|x - a| < \delta$,

$$m_\delta(f) \leq f(x) \leq M_\delta(f).$$

Since $f(a)$ also lies between $m_\delta(f)$ and $M_\delta(f)$, it follows that $|f(x) - f(a)| < \epsilon$.
Thus $f$ is continuous at $a$.
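
The oscillation can be estimated numerically, at least roughly. The following Python sketch (an illustration added here, not part of the original text) approximates $M_\delta(f) - m_\delta(f)$ by sampling $f$ at finitely many points within $\delta$ of $a$, and takes the smallest value found over a short list of radii. Sampling can only underestimate the true supremum and overestimate the true infimum, so the result is a heuristic, not a bound; the function name, the sample count, and the list of radii are assumptions made for the example.

```python
def oscillation_estimate(f, a, lo, hi, deltas, samples=1000):
    """Sampling estimate of inf over delta of M_delta(f) - m_delta(f) on Q = [lo, hi]."""
    best = float("inf")
    for delta in deltas:
        left = max(lo, a - delta)
        right = min(hi, a + delta)
        # Sample f at a itself and at the midpoints of `samples` equal cells;
        # every sample point lies in Q and strictly within delta of a.
        values = [f(a)]
        for i in range(samples):
            values.append(f(left + (right - left) * (i + 0.5) / samples))
        best = min(best, max(values) - min(values))
    return best


def step(x):
    """A function on [0, 1] with a single jump, at x = 1/2."""
    return 0.0 if x < 0.5 else 1.0


deltas = [0.5, 0.1, 0.01, 0.001]
print(oscillation_estimate(step, 0.5, 0.0, 1.0, deltas))
print(oscillation_estimate(step, 0.25, 0.0, 1.0, deltas))
```

For this step function the estimate is 1 at the jump and 0 at the point $1/4$, in agreement with the statement that $f$ is continuous at $a$ exactly when $\nu(f; a) = 0$.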

Step 3. We prove the “only if” part of the theorem. Assume $f$ is integrable
over $Q$. We show that the set $D$ of discontinuities of $f$ has measure
zero in $\mathbf{R}^n$.

For each positive integer $m$, let

$$D_m = \{a \mid \nu(f; a) \geq 1/m\}.$$

Then by Step 2, $D$ equals the union of the sets $D_m$. We show that each set
$D_m$ has measure zero; this will suffice.

Let $m$ be fixed. Given $\epsilon > 0$, we shall cover $D_m$ by countably many
rectangles of total volume less than $\epsilon$.

First choose a partition $P$ of $Q$ for which $U(f, P) - L(f, P) < \epsilon/(2m)$.
Then let $D'_m$ consist of those points of $D_m$ that belong to $\text{Bd } R$ for some
subrectangle $R$ determined by $P$; and let $D''_m$ consist of the remainder of $D_m$.
We cover each of $D'_m$ and $D''_m$ by rectangles having total volume less than $\epsilon/2$.

For $D'_m$, this is easy. Given $R$, the set $\text{Bd } R$ has measure zero in $\mathbf{R}^n$; then
so does the union $\bigcup_R \text{Bd } R$. Since $D'_m$ is contained in this union, it may be
covered by countably many rectangles of total volume less than $\epsilon/2$.

Now we consider $D''_m$. Let $R_1, \ldots, R_k$ be those subrectangles determined
by $P$ that contain points of $D''_m$. We show that these subrectangles have
total volume less than $\epsilon/2$. Given $i$, the rectangle $R_i$ contains a point $a$ of
$D''_m$. Since $a \notin \text{Bd } R_i$, there is a $\delta > 0$ such that $R_i$ contains the cubical
neighborhood of radius $\delta$ centered at $a$. Then

$$1/m \leq \nu(f; a) \leq M_\delta(f) - m_\delta(f) \leq M_{R_i}(f) - m_{R_i}(f).$$

Multiplying by $v(R_i)$ and summing, we have

$$\sum_{i=1}^{k} (1/m)\, v(R_i) \leq U(f, P) - L(f, P) < \epsilon/(2m).$$

Then the rectangles $R_1, \ldots, R_k$ have total volume less than $\epsilon/2$. □

We give an application of this theorem: 

Theorem 11.3. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $f : Q \to \mathbf{R}$; assume
$f$ is integrable over $Q$.

(a) If $f$ vanishes except on a set of measure zero, then $\int_Q f = 0$.

(b) If $f$ is non-negative and if $\int_Q f = 0$, then $f$ vanishes except on a
set of measure zero.

Proof. (a) Suppose $f$ vanishes except on a set $E$ of measure zero. Let
$P$ be a partition of $Q$. If $R$ is a subrectangle determined by $P$, then $R$ is
not contained in $E$ (a rectangle does not have measure zero, by Theorem 11.1),
so that $f$ vanishes at some point of $R$. Then $m_R(f) \leq 0$
and $M_R(f) \geq 0$. It follows that $L(f, P) \leq 0$ and $U(f, P) \geq 0$. Since these
inequalities hold for all $P$,

$$\underline{\int_Q} f \leq 0 \quad\text{and}\quad \overline{\int_Q} f \geq 0.$$

Since $\int_Q f$ exists, it must equal zero.

(b) Suppose $f(x) \geq 0$ and $\int_Q f = 0$. We show that if $f$ is continuous
at $a$, then $f(a) = 0$. It follows that $f$ must vanish except possibly at points
where $f$ fails to be continuous; the set of such points has measure zero by the
preceding theorem.

We suppose that $f$ is continuous at $a$ and that $f(a) > 0$ and derive a
contradiction. Set $\epsilon = f(a)$. Since $f$ is continuous at $a$, there is a $\delta > 0$ such
that

$$f(x) > \epsilon/2 \quad\text{for } |x - a| < \delta \text{ and } x \in Q.$$

Choose a partition $P$ of $Q$ of mesh less than $\delta$. If $R_0$ is a subrectangle
determined by $P$ that contains $a$, then $m_{R_0}(f) \geq \epsilon/2$. On the other hand,
$m_R(f) \geq 0$ for all $R$. It follows that

$$L(f, P) = \sum_R m_R(f)\, v(R) \geq (\epsilon/2)\, v(R_0) > 0.$$

But

$$L(f, P) \leq \int_Q f = 0.$$ □


EXAMPLE 2. The assumption that $\int_Q f$ exists is necessary for the truth of
this theorem. For example, let $I = [0,1]$ and let $f(x) = 1$ for $x$ rational and
$f(x) = 0$ for $x$ irrational. Then $f$ vanishes except on a set of measure zero.
But it is not true that $\int_I f = 0$, for the integral of $f$ over $I$ does not even
exist.
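
To see concretely why the integral in Example 2 fails to exist, one can compute the lower and upper sums directly. The following short computation is a sketch added for illustration; it uses only the fact that every subinterval $R$ (of positive length) determined by a partition $P$ of $I$ contains both rational and irrational points, so that $m_R(f) = 0$ and $M_R(f) = 1$.

```latex
L(f,P) = \sum_R m_R(f)\, v(R) = 0,
\qquad
U(f,P) = \sum_R M_R(f)\, v(R) = \sum_R v(R) = 1 .
```

Since these values do not depend on $P$, the lower integral of $f$ over $I$ is 0 and the upper integral is 1, and the two never agree.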


EXERCISES 

1. Show that if $A$ has measure zero in $\mathbf{R}^n$, the sets $\bar{A}$ and Bd $A$ need not
have measure zero.

2. Show that no open set in $\mathbf{R}^n$ has measure zero in $\mathbf{R}^n$.

3. Show that the set $\mathbf{R}^{n-1} \times 0$ has measure zero in $\mathbf{R}^n$.

4. Show that the set of irrationals in $[0,1]$ does not have measure zero in $\mathbf{R}$.

5. Show that if $A$ is a compact subset of $\mathbf{R}^n$ and $A$ has measure zero in $\mathbf{R}^n$,
then given $\epsilon > 0$, there is a finite collection of rectangles of total volume
less than $\epsilon$ covering $A$.

6. Let $f : [a,b] \to \mathbf{R}$. The graph of $f$ is the subset

$$G_f = \{(x,y) \mid y = f(x)\}$$

of $\mathbf{R}^2$. Show that if $f$ is continuous, $G_f$ has measure zero in $\mathbf{R}^2$. [Hint:
Use uniform continuity of $f$.]

7. Consider the function $f$ defined in Example 2. At what points of $[0,1]$
does $f$ fail to be continuous? Answer the same question for the function
defined in Exercise 5 of §10.

8. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $f : Q \to \mathbf{R}$ be a bounded function. Show
that if $f$ vanishes except on a closed set $B$ of measure zero, then $\int_Q f$
exists and equals zero.

9. Let $Q$ be a rectangle in $\mathbf{R}^n$; let $f : Q \to \mathbf{R}$; assume $f$ is integrable over $Q$.

(a) Show that if $f(x) \ge 0$ for $x \in Q$, then $\int_Q f \ge 0$.

(b) Show that if $f(x) > 0$ for $x \in Q$, then $\int_Q f > 0$.

10. Show that if $Q_1, Q_2, \dots$ is a countable collection of rectangles covering
$Q$, then $v(Q) \le \sum v(Q_i)$.


§12. EVALUATION OF THE INTEGRAL 

Given that a function $f : Q \to \mathbf{R}$ is integrable, how does one evaluate its
integral?

Even in the case of a function $f : [a,b] \to \mathbf{R}$ of a single variable, the
problem is not easy. One tool is provided by the fundamental theorem of
calculus, which is applicable when $f$ is continuous. This theorem is familiar
to you from single-variable analysis. For reference, we state it here:

Theorem 12.1 (Fundamental theorem of calculus). (a) If $f$ is
continuous on $[a,b]$, and if

$$F(x) = \int_a^x f$$

for $x \in [a,b]$, then $F'(x)$ exists and equals $f(x)$.

(b) If $f$ is continuous on $[a,b]$, and if $g$ is a function such that
$g'(x) = f(x)$ for $x \in [a,b]$, then

$$\int_a^b f = g(b) - g(a). \quad □$$


(When one refers to the derivatives $F'$ and $g'$ at the end points of the
interval $[a,b]$, one means of course the appropriate "one-sided" derivatives.)
The conclusions of this theorem are summarized in the two equations

$$D \int_a^x f = f(x) \quad\text{and}\quad \int_a^x Dg = g(x) - g(a).$$

In each case, the integrand is required to be continuous on the interval in
question.

Part (b) of this theorem tells us we can calculate the integral of a continuous
function $f$ if we can find an antiderivative of $f$, that is, a function $g$
such that $g' = f$. Part (a) of the theorem tells us that such an antiderivative
always exists (in theory), since $F$ is such an antiderivative. The problem, of
course, is to find such an antiderivative in practice. That is what the so-called
"Technique of Integration," as studied in calculus, is about.
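
As an informal numerical illustration of both parts of the fundamental theorem (not part of the text's development), the following Python sketch approximates integrals by midpoint Riemann sums. The helper name `riemann_midpoint`, the choice $f = \cos$ with antiderivative $g = \sin$, and the sample point `x0` are all assumptions made for the example.

```python
import math

def riemann_midpoint(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b] using n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Part (b): the integral of f = cos equals g(b) - g(a) with g = sin.
a, b = 0.0, math.pi / 3
numeric = riemann_midpoint(math.cos, a, b)
exact = math.sin(b) - math.sin(a)

# Part (a): F(x) = integral of f from a to x should satisfy F'(x) = f(x).
def F(x):
    return riemann_midpoint(math.cos, a, x)

x0, step = 0.7, 1e-4
# Central difference quotient as an estimate of F'(x0).
derivative_estimate = (F(x0 + step) - F(x0 - step)) / (2 * step)

print(numeric, exact)
print(derivative_estimate, math.cos(x0))
```

The two printed pairs should agree to several decimal places; the agreement is only approximate, since both the Riemann sum and the difference quotient carry discretization error.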

The same difficulties of evaluating the integral occur with n-dimensional 
integrals. One way of approaching the problem is to attempt to reduce the 
computation of an n-dimensional integral to the presumably simpler prob- 
lem of computing a sequence of lower-dimensional integrals. One might even 
be able to reduce the problem to computing a sequence of one-dimensional 
integrals, to which, if the integrand is continuous, one could apply the funda- 
mental theorem of calculus. 

This is the approach used in calculus to compute a double integral. To
integrate the continuous function $f(x,y)$ over the rectangle $Q = [a,b] \times [c,d]$,
for example, one integrates $f$ first with respect to $y$, holding $x$ fixed, and
then integrates the resulting function with respect to $x$. (Or the other way
around.) In doing so, one is using the formula

$$\int_Q f = \int_{x=a}^{x=b} \int_{y=c}^{y=d} f(x,y)$$

or its reverse. (In calculus, one usually inserts the meaningless symbols “ dx ” 
and “ dy ,” but we are avoiding this notation here.) These formulas are not 
usually proved in calculus. In fact, it is seldom mentioned that a proof is 
needed; they are taken as “obvious.” We shall prove them, and their appro- 
priate n-dimensional versions, in this section. 
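
The formula above can be illustrated numerically. The sketch below is a hedged example only: it compares a two-dimensional midpoint Riemann sum over $Q = [0,1] \times [0,2]$ with an iterated one-dimensional computation for the assumed integrand $f(x,y) = x y^2$, whose integral is $4/3$. The function names are hypothetical. Because a finite sum can always be reordered, the agreement of the two numbers illustrates the statement rather than proving it.

```python
def midpoint_nodes(lo, hi, n):
    """Midpoints of n equal subintervals of [lo, hi], and the subinterval width."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

def double_sum(f, a, b, c, d, n=200):
    """Riemann sum of f over the rectangle [a,b] x [c,d], one term per subrectangle."""
    xs, hx = midpoint_nodes(a, b, n)
    ys, hy = midpoint_nodes(c, d, n)
    return sum(f(x, y) for x in xs for y in ys) * hx * hy

def iterated(f, a, b, c, d, n=200):
    """First integrate in y for each fixed x, then integrate the result in x."""
    xs, hx = midpoint_nodes(a, b, n)
    ys, hy = midpoint_nodes(c, d, n)
    inner = [sum(f(x, y) for y in ys) * hy for x in xs]
    return sum(inner) * hx

f = lambda x, y: x * y ** 2
direct = double_sum(f, 0.0, 1.0, 0.0, 2.0)
step_by_step = iterated(f, 0.0, 1.0, 0.0, 2.0)
print(direct, step_by_step, 4 / 3)
```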

These formulas hold when $f$ is continuous. But when $f$ is integrable but
not continuous, difficulties can arise concerning the existence of the various
integrals involved. For instance, the integral

$$\int_{y=c}^{y=d} f(x,y)$$

may not exist for all $x$ even though $\int_Q f$ exists, for the function $f$ can behave
badly along a single vertical line without that behavior affecting the existence
of the double integral.

One could avoid the problem by simply assuming that all the integrals 
involved exist. What we shall do instead is to replace the inner integral in 
the statement of the formula by the corresponding lower integral (or upper 
integral), which we know exists. When we do this, a correct general theorem 
results; it includes as a special case the case where all the integrals exist. 





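Theorem 12.2 (Fubini's theorem). Let $Q = A \times B$, where $A$ is a
rectangle in $\mathbf{R}^k$ and $B$ is a rectangle in $\mathbf{R}^n$. Let $f : Q \to \mathbf{R}$ be a bounded
function; write $f$ in the form $f(x,y)$ for $x \in A$ and $y \in B$. For each
$x \in A$, consider the lower and upper integrals

$$\underline{\int_{y \in B}} f(x,y) \quad\text{and}\quad \overline{\int_{y \in B}} f(x,y).$$

If $f$ is integrable over $Q$, then these two functions of $x$ are integrable
over $A$, and

$$\int_Q f = \int_{x \in A} \underline{\int_{y \in B}} f(x,y) = \int_{x \in A} \overline{\int_{y \in B}} f(x,y).$$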

Proof. For purposes of this proof, define

$$\underline{I}(x) = \underline{\int_{y \in B}} f(x,y) \quad\text{and}\quad \overline{I}(x) = \overline{\int_{y \in B}} f(x,y)$$

for $x \in A$. Assuming $\int_Q f$ exists, we show that $\underline{I}$ and $\overline{I}$ are integrable over
$A$, and that their integrals equal $\int_Q f$.

Let $P$ be a partition of $Q$. Then $P$ consists of a partition $P_A$ of $A$, and a
partition $P_B$ of $B$. We write $P = (P_A, P_B)$. If $R_A$ is the general subrectangle
of $A$ determined by $P_A$, and if $R_B$ is the general subrectangle of $B$ determined
by $P_B$, then $R_A \times R_B$ is the general subrectangle of $Q$ determined by $P$.

We begin by comparing the lower and upper sums for $f$ with the lower
and upper sums for $\underline{I}$ and $\overline{I}$.

Step 1. We first show that

$$L(f,P) \le L(\underline{I}, P_A);$$

that is, the lower sum for $f$ is no larger than the lower sum for the lower
integral, $\underline{I}$.

Consider the general subrectangle $R_A \times R_B$ determined by $P$. Let $x_0$ be
a point of $R_A$. Now

$$m_{R_A \times R_B}(f) \le f(x_0, y)$$

for all $y \in R_B$; hence

$$m_{R_A \times R_B}(f) \le m_{R_B}(f(x_0, y)).$$

See Figure 12.1. Holding $x_0$ and $R_A$ fixed, multiply by $v(R_B)$ and sum over
all subrectangles $R_B$. One obtains the inequalities

$$\sum_{R_B} m_{R_A \times R_B}(f)\, v(R_B) \le L(f(x_0,y), P_B) \le \underline{\int_{y \in B}} f(x_0,y) = \underline{I}(x_0).$$
Figure 12.1


This result holds for each $x_0 \in R_A$. We conclude that

$$\sum_{R_B} m_{R_A \times R_B}(f)\, v(R_B) \le m_{R_A}(\underline{I}).$$

Now multiply through by $v(R_A)$ and sum. Since $v(R_A)\, v(R_B) = v(R_A \times R_B)$,
one obtains the desired inequality

$$L(f,P) \le L(\underline{I}, P_A).$$


Step 2. An entirely similar proof shows that

$$U(f,P) \ge U(\overline{I}, P_A);$$

that is, the upper sum for $f$ is no smaller than the upper sum for the upper
integral, $\overline{I}$. The proof is left as an exercise.

Step 3. We summarize the relations that hold among the upper and
lower sums of $f$, $\underline{I}$, and $\overline{I}$ in the following diagram:

$$L(f,P) \le L(\underline{I},P_A) \le
\left\{ \begin{array}{c} U(\underline{I},P_A) \\ L(\overline{I},P_A) \end{array} \right\}
\le U(\overline{I},P_A) \le U(f,P).$$

The first and last inequalities in this diagram come from Steps 1 and 2. Of the
remaining inequalities, the two on the upper left and lower right follow from
the fact that $L(h,P) \le U(h,P)$ for any $h$ and $P$. The ones on the lower left
and upper right follow from the fact that $\underline{I}(x) \le \overline{I}(x)$ for all $x$. This diagram
contains all the information we shall need.
Step 4. We prove the theorem. Because $f$ is integrable over $Q$, we can,
given $\epsilon > 0$, choose a partition $P = (P_A, P_B)$ of $Q$ so that the numbers at the
extreme ends of the diagram in Step 3 are within $\epsilon$ of each other. Then the
upper and lower sums for $\underline{I}$ are within $\epsilon$ of each other, and so are the upper
and lower sums for $\overline{I}$. It follows that both $\underline{I}$ and $\overline{I}$ are integrable over $A$.

Now we note that by definition the integral $\int_A \underline{I}$ lies between the upper
and lower sums of $\underline{I}$. Similarly, the integral $\int_A \overline{I}$ lies between the upper and
lower sums for $\overline{I}$. Hence all three numbers

$$\int_A \underline{I} \quad\text{and}\quad \int_A \overline{I} \quad\text{and}\quad \int_Q f$$

lie between the numbers at the extreme ends of the diagram. Because $\epsilon$ is
arbitrary, we must have

$$\int_A \underline{I} = \int_A \overline{I} = \int_Q f. \quad □$$


This theorem expresses $\int_Q f$ as an iterated integral. To compute $\int_Q f$,
one first computes the lower integral (or upper integral) of $f$ with respect to
$y$, and then one integrates the resulting function with respect to $x$. There is
nothing special about the order of integration; a similar proof shows that one
can compute $\int_Q f$ by first taking the lower integral (or upper integral) of $f$
with respect to $x$, and then integrating this function with respect to $y$.

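Corollary 12.3. Let $Q = A \times B$, where $A$ is a rectangle in $\mathbf{R}^k$ and
$B$ is a rectangle in $\mathbf{R}^n$. Let $f : Q \to \mathbf{R}$ be a bounded function. If $\int_Q f$
exists, and if $\int_{y \in B} f(x,y)$ exists for each $x \in A$, then

$$\int_Q f = \int_{x \in A} \int_{y \in B} f(x,y). \quad □$$

Corollary 12.4. Let $Q = I_1 \times \cdots \times I_n$, where $I_j$ is a closed interval
in $\mathbf{R}$ for each $j$. If $f : Q \to \mathbf{R}$ is continuous, then

$$\int_Q f = \int_{x_1 \in I_1} \cdots \int_{x_n \in I_n} f(x_1, \dots, x_n). \quad □$$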



EXERCISES 


1. Carry out Step 2 of the proof of Theorem 12.2.

2. Let $I = [0,1]$; let $Q = I \times I$. Define $f : Q \to \mathbf{R}$ by letting $f(x,y) = 1/q$
if $y$ is rational and $x = p/q$, where $p$ and $q$ are positive integers with no
common factor; let $f(x,y) = 0$ otherwise.

(a) Show that $\int_Q f$ exists.

(b) Compute

$$\underline{\int_{y \in I}} f(x,y) \quad\text{and}\quad \overline{\int_{y \in I}} f(x,y).$$

(c) Verify Fubini's theorem.

3. Let $Q = A \times B$, where $A$ is a rectangle in $\mathbf{R}^k$ and $B$ is a rectangle in $\mathbf{R}^n$.
Let $f : Q \to \mathbf{R}$ be a bounded function.

(a) Let $g$ be a function such that

$$\underline{\int_{y \in B}} f(x,y) \le g(x) \le \overline{\int_{y \in B}} f(x,y)$$

for all $x \in A$. Show that if $f$ is integrable over $Q$, then $g$ is integrable
over $A$, and $\int_Q f = \int_A g$. [Hint: Use Exercise 1 of §10.]

(b) Give an example where $\int_Q f$ exists and one of the iterated integrals

$$\int_{x \in A} \int_{y \in B} f(x,y) \quad\text{and}\quad \int_{y \in B} \int_{x \in A} f(x,y)$$

exists, but the other does not.

*(c) Find an example where both the iterated integrals of (b) exist, but
the integral $\int_Q f$ does not. [Hint: One approach is to find a subset
$S$ of $Q$ whose closure equals $Q$, such that $S$ contains at most one
point on each vertical line and at most one point on each horizontal
line.]

4. Let $A$ be open in $\mathbf{R}^2$; let $f : A \to \mathbf{R}$ be of class $C^2$. Let $Q$ be a rectangle
contained in $A$.

(a) Use Fubini's theorem and the fundamental theorem of calculus to
show that

$$\int_Q D_2 D_1 f = \int_Q D_1 D_2 f.$$

(b) Give a proof, independent of the one given in §6, that $D_2 D_1 f(x) =
D_1 D_2 f(x)$ for each $x \in A$.


§13. THE INTEGRAL OVER A BOUNDED SET

In the applications of integration theory, one usually wishes to integrate
functions over sets that are not rectangles. The problem of finding the mass of a
circular plate of variable density, for instance, involves integrating a function
over a circular region. So does the problem of finding the center of gravity of
a spherical cap. Therefore we seek to generalize our definition of the integral.
That is not in fact difficult.

Definition. Let $S$ be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a bounded
function. Define $f_S : \mathbf{R}^n \to \mathbf{R}$ by the equation

$$f_S(x) = \begin{cases} f(x) & \text{for } x \in S, \\ 0 & \text{otherwise.} \end{cases}$$

Choose a rectangle $Q$ containing $S$. We define the integral of $f$ over $S$ by
the equation

$$\int_S f = \int_Q f_S,$$

provided the latter integral exists.
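
The definition can be tried out numerically. The following Python sketch is an illustration under stated assumptions, not part of the theory: it extends a function by zero outside the closed unit disc $S$ in $\mathbf{R}^2$ and approximates $\int_Q f_S$ by a midpoint Riemann sum for two different rectangles $Q$ containing $S$. The names `extend_by_zero` and `integral_over_rectangle` are hypothetical. With $f = 1$ both approximations should be close to $\pi$, which is consistent with the claim that the choice of $Q$ does not matter.

```python
import math

def extend_by_zero(f, in_S):
    """Return f_S: equal to f on S (as decided by in_S) and 0 elsewhere."""
    return lambda x, y: f(x, y) if in_S(x, y) else 0.0

def integral_over_rectangle(g, a, b, c, d, n=400):
    """Midpoint Riemann sum of g over Q = [a,b] x [c,d] with n*n subrectangles."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy

in_disc = lambda x, y: x * x + y * y <= 1.0
one_S = extend_by_zero(lambda x, y: 1.0, in_disc)

area_small_Q = integral_over_rectangle(one_S, -1, 1, -1, 1)
area_large_Q = integral_over_rectangle(one_S, -2, 2, -2, 2)
print(area_small_Q, area_large_Q, math.pi)
```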

We must show this definition is independent of the choice of Q . That is 
the substance of the following lemma: 

Lemma 13.1. Let $Q$ and $Q'$ be two rectangles in $\mathbf{R}^n$. If $f : \mathbf{R}^n \to \mathbf{R}$
is a bounded function that vanishes outside $Q \cap Q'$, then

$$\int_Q f = \int_{Q'} f;$$

one integral exists if and only if the other does.

Proof. We consider first the case where $Q \subset Q'$. Let $E$ be the set
of points of Int $Q$ at which $f$ fails to be continuous. Then both the maps
$f : Q \to \mathbf{R}$ and $f : Q' \to \mathbf{R}$ are continuous except at points of $E$ and
possibly at points of Bd $Q$. Existence of each integral is thus equivalent to
the requirement that $E$ have measure zero.

Now suppose both integrals exist. Let $P$ be a partition of $Q'$ and let $P''$
be the refinement of $P$ obtained from $P$ by adjoining the end points of the
component intervals of $Q$. Then $Q$ is a union of subrectangles $R$ determined
by $P''$. See Figure 13.1. If $R$ is a subrectangle determined by $P''$ that is not
contained in $Q$, then $f$ vanishes at some point of $R$, whence $m_R(f) \le 0$. It
follows that

$$L(f,P) \le L(f,P'') \le \sum_{R \subset Q} m_R(f)\, v(R) \le \int_Q f.$$

Figure 13.1


An entirely similar argument shows that $U(f,P) \ge \int_Q f$. Since $P$ is an
arbitrary partition of $Q'$, it follows that $\int_Q f = \int_{Q'} f$.

The proof for an arbitrary pair of rectangles $Q, Q'$ involves choosing
a rectangle $Q''$ containing them both, and noting that $\int_Q f = \int_{Q''} f =
\int_{Q'} f$. □

In the remainder of this section, we study the basic properties of this 
integral, and we obtain conditions for its existence. In the next section, we 
derive (as far as we are able) a method for its evaluation. 

Lemma 13.2. Let $S$ be a subset of $\mathbf{R}^n$; let $f, g : S \to \mathbf{R}$. Let
$F, G : S \to \mathbf{R}$ be defined by the equations

$$F(x) = \max\{f(x), g(x)\} \quad\text{and}\quad G(x) = \min\{f(x), g(x)\}.$$

(a) If $f$ and $g$ are continuous at $x_0$, so are $F$ and $G$.

(b) If $f$ and $g$ are integrable over $S$, so are $F$ and $G$.

Proof. (a) Suppose $f$ and $g$ are continuous at $x_0$. Consider first the
case in which $f(x_0) = g(x_0) = r$. Then $F(x_0) = G(x_0) = r$. By continuity,
given $\epsilon > 0$, we can choose $\delta > 0$ so that

$$|f(x) - r| < \epsilon \quad\text{and}\quad |g(x) - r| < \epsilon$$

for $|x - x_0| < \delta$ and $x \in S$; for such values of $x$, it follows automatically that

$$|F(x) - F(x_0)| < \epsilon \quad\text{and}\quad |G(x) - G(x_0)| < \epsilon.$$

On the other hand, suppose $f(x_0) > g(x_0)$. By continuity, we can find a
neighborhood $U$ of $x_0$ such that $f(x) - g(x) > 0$ for $x \in U$ and $x \in S$.
Then $F(x) = f(x)$ and $G(x) = g(x)$ on $U \cap S$; it follows that $F$ and $G$ are
continuous at $x_0$. A similar argument holds if $f(x_0) < g(x_0)$.

(b) Suppose $f$ and $g$ are integrable over $S$. Let $Q$ be a rectangle
containing $S$. Then $f_S$ and $g_S$ are continuous on $Q$ except on subsets $D$ and $E$,
respectively, of $Q$, each of measure zero. Now

$$F_S(x) = \max\{f_S(x), g_S(x)\} \quad\text{and}\quad G_S(x) = \min\{f_S(x), g_S(x)\},$$

as you can easily check. It follows that $F_S$ and $G_S$ are continuous on $Q$
except on the set $D \cup E$, which has measure zero. Furthermore, $F_S$ and
$G_S$ are bounded because $f_S$ and $g_S$ are. Then $F_S$ and $G_S$ are integrable
over $Q$. □


Theorem 13.3 (Properties of the integral). Let $S$ be a bounded
set in $\mathbf{R}^n$; let $f, g : S \to \mathbf{R}$ be bounded functions.

(a) (Linearity). If $f$ and $g$ are integrable over $S$, so is $af + bg$, and

$$\int_S (af + bg) = a \int_S f + b \int_S g.$$

(b) (Comparison). Suppose $f$ and $g$ are integrable over $S$. If $f(x) \le
g(x)$ for $x \in S$, then

$$\int_S f \le \int_S g.$$

Furthermore, $|f|$ is integrable over $S$ and

$$\left| \int_S f \right| \le \int_S |f|.$$

(c) (Monotonicity). Let $T \subset S$. If $f$ is non-negative on $S$ and
integrable over $T$ and $S$, then

$$\int_T f \le \int_S f.$$

(d) (Additivity). If $S = S_1 \cup S_2$ and $f$ is integrable over $S_1$ and $S_2$,
then $f$ is integrable over $S$ and $S_1 \cap S_2$; furthermore

$$\int_S f = \int_{S_1} f + \int_{S_2} f - \int_{S_1 \cap S_2} f.$$

Proof. (a) It suffices to prove this result for the integral over a rectangle,
since

$$(af + bg)_S = a f_S + b g_S.$$





So suppose $f$ and $g$ are integrable over $Q$. Then $f$ and $g$ are continuous except
on sets $D$, $E$, respectively, of measure zero. It follows that the function $af + bg$
is continuous except on the set $D \cup E$, so it is integrable over $Q$.

We consider first the case where $a, b \ge 0$. Let $P''$ be an arbitrary partition
of $Q$. If $R$ is a subrectangle determined by $P''$, then

$$a\, m_R(f) + b\, m_R(g) \le a f(x) + b g(x)$$

for all $x \in R$. It follows that

$$a\, m_R(f) + b\, m_R(g) \le m_R(af + bg),$$

so that

$$a L(f,P'') + b L(g,P'') \le L(af + bg, P'') \le \int_Q (af + bg).$$

A similar argument shows that

$$a U(f,P'') + b U(g,P'') \ge \int_Q (af + bg).$$

Now let $P$ and $P'$ be any two partitions of $Q$, and let $P''$ be their common
refinement. It follows from what we have just proved that

$$a L(f,P) + b L(g,P') \le \int_Q (af + bg) \le a U(f,P) + b U(g,P').$$

Now by definition the number $a \int_Q f + b \int_Q g$ also lies between the numbers
at the ends of this sequence of inequalities. Since $P$ and $P'$ are arbitrary, we
conclude that

$$\int_Q (af + bg) = a \int_Q f + b \int_Q g.$$

Now we complete the proof by showing that

$$\int_Q (-f) = - \int_Q f.$$

Let $P$ be a partition of $Q$; let $R$ be a subrectangle determined by $P$. For
$x \in R$, we have

$$-M_R(f) \le -f(x) \le -m_R(f),$$

so that

$$-M_R(f) \le m_R(-f) \quad\text{and}\quad M_R(-f) \le -m_R(f).$$

Multiplying by $v(R)$ and summing, we obtain the inequalities

$$-U(f,P) \le L(-f,P) \le \int_Q (-f) \le U(-f,P) \le -L(f,P).$$

By definition, the number $-\int_Q f$ also lies between the numbers at the extreme
ends of this sequence of inequalities. Since $P$ is arbitrary, our result follows.

(b) It suffices to prove the comparison property for the integral over a
rectangle. So suppose $f(x) \le g(x)$ for $x \in Q$. If $R$ is any rectangle contained
in $Q$, then

$$m_R(f) \le f(x) \le g(x)$$

for each $x \in R$. Then $m_R(f) \le m_R(g)$. It follows that if $P$ is any partition
of $Q$,

$$L(f,P) \le L(g,P) \le \int_Q g.$$

Since $P$ is arbitrary, we conclude that

$$\int_Q f \le \int_Q g.$$

The fact that $|f|$ is integrable over $S$ follows from the equation

$$|f(x)| = \max\{f(x), -f(x)\}.$$

The desired inequality follows by applying the comparison property to the
inequalities

$$-|f(x)| \le f(x) \le |f(x)|.$$

(c) If $f$ is non-negative and if $T \subset S$, then $f_T(x) \le f_S(x)$ for all $x$. One
then applies the comparison property.

(d) Let $T = S_1 \cap S_2$. We prove $f$ is integrable over $S$ and $T$. Consider
first the special case where $f$ is non-negative on $S$. Let $Q$ be a rectangle
containing $S$. Then both $f_{S_1}$ and $f_{S_2}$ are integrable over $Q$ by hypothesis. It
follows from the equations

$$f_S(x) = \max\{f_{S_1}(x), f_{S_2}(x)\} \quad\text{and}\quad f_T(x) = \min\{f_{S_1}(x), f_{S_2}(x)\}$$

that $f_S$ and $f_T$ are integrable over $Q$.

In the general case, set

$$f_+(x) = \max\{f(x), 0\} \quad\text{and}\quad f_-(x) = \max\{-f(x), 0\}.$$

Since $f$ is integrable over $S_1$ and $S_2$, so are $f_+$ and $f_-$. By the special case
already considered, $f_+$ and $f_-$ are integrable over $S$ and $T$. Because

$$f(x) = f_+(x) - f_-(x),$$

it follows from linearity that $f$ is integrable over $S$ and $T$.

The desired additivity formula follows by applying linearity to the
equation

$$f_S(x) = f_{S_1}(x) + f_{S_2}(x) - f_T(x). \quad □$$
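
The pointwise identity used in the last step of the proof of (d) carries over to Riemann sums, which gives a simple numerical check of additivity. The sketch below is illustrative only: the two overlapping discs $S_1$, $S_2$, the integrand, and the helper `integral_over_set` are assumptions chosen for the example, and the sums only approximate the integrals involved.

```python
import math

def integral_over_set(f, in_S, a, b, c, d, n=300):
    """Midpoint Riemann sum over Q = [a,b] x [c,d] of f extended by zero off S."""
    hx, hy = (b - a) / n, (d - c) / n
    values = (
        f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
        for i in range(n) for j in range(n)
        if in_S(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
    )
    # fsum keeps rounding error small so the two sides can be compared closely.
    return math.fsum(values) * hx * hy

in_S1 = lambda x, y: x * x + y * y <= 1.0            # unit disc centered at (0, 0)
in_S2 = lambda x, y: (x - 1.0) ** 2 + y * y <= 1.0   # unit disc centered at (1, 0)
in_S = lambda x, y: in_S1(x, y) or in_S2(x, y)       # S = S1 union S2
in_T = lambda x, y: in_S1(x, y) and in_S2(x, y)      # T = S1 intersect S2
f = lambda x, y: x * y + math.cos(x)

Q = (-1.0, 2.0, -1.0, 1.0)                           # a rectangle containing S
lhs = integral_over_set(f, in_S, *Q)
rhs = (integral_over_set(f, in_S1, *Q) + integral_over_set(f, in_S2, *Q)
       - integral_over_set(f, in_T, *Q))
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, because every sample point contributes the same amount to both sides.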


Corollary 13.4. Let $S_1, \dots, S_k$ be bounded sets in $\mathbf{R}^n$; assume
$S_i \cap S_j$ has measure zero whenever $i \ne j$. Let $S = S_1 \cup \cdots \cup S_k$. If
$f : S \to \mathbf{R}$ is integrable over each set $S_i$, then $f$ is integrable over $S$ and

$$\int_S f = \int_{S_1} f + \cdots + \int_{S_k} f.$$

Proof. The case $k = 2$ follows from additivity, since the integral of $f$
over $S_1 \cap S_2$ vanishes by Theorem 11.3. The general case follows by induction.
□


Up to this point, we have made no a priori restrictions on the functions $f$
we deal with in integration theory, other than that they be bounded. In
particular, we have not required $f$ to be continuous. The reason is obvious;
in order to define the integral $\int_S f$, even in the case where $f$ is continuous on
$S$, we needed to deal with the function $f_S$, which need not be continuous at
points of Bd $S$.

However, our primary interest in this book is in integrals of the form $\int_S f$,
where $f$ is continuous on $S$. Therefore we make the following:

Convention. Henceforth, we restrict ourselves in studying integration
theory to the integration of continuous functions $f : S \to \mathbf{R}$.

Now we consider conditions under which the integral $\int_S f$ exists. Even if
we assume $f$ is bounded and continuous on $S$, we need some sort of condition
involving the set $S$ to ensure that $\int_S f$ exists. That condition is the following:

Theorem 13.5. Let $S$ be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a
bounded continuous function. Let $E$ be the set of points $x_0$ of Bd $S$ for
which the condition

$$\lim_{x \to x_0} f(x) = 0$$

fails to hold. If $E$ has measure zero, then $f$ is integrable over $S$.

The converse of this theorem also holds; since we shall not need it, we 
leave the proof to the exercises. 
Proof. Let $x_0$ be a point of $\mathbf{R}^n$ not in $E$. We show that the function $f_S$
is continuous at $x_0$; the theorem follows.

If $x_0 \in \operatorname{Int} S$, then the functions $f$ and $f_S$ agree in a neighborhood of $x_0$;
since $f$ is continuous at $x_0$, so is $f_S$. If $x_0 \in \operatorname{Ext} S$, then $f_S$ vanishes in a
neighborhood of $x_0$. Suppose $x_0 \in \operatorname{Bd} S$; then $x_0$ may or may not belong
to $S$. See Figure 13.2. Since $x_0 \notin E$, we know that $f(x) \to 0$ as $x$ approaches
$x_0$ through points of $S$. Since $f$ is continuous, it follows that $f(x_0) = 0$ if
$x_0$ belongs to $S$. It also follows, since $f_S(x)$ equals either $f(x)$ or 0, that
$f_S(x) \to 0$ as $x$ approaches $x_0$ through points of $\mathbf{R}^n$. To show that $f_S$ is
continuous at $x_0$, we must show that $f_S(x_0) = 0$. If $x_0 \notin S$, this follows
by definition. If $x_0 \in S$, then $f_S(x_0) = f(x_0)$, which vanishes, as noted
earlier. □


Figure 13.2


The same techniques may be used to prove the following theorem, which 
is sometimes useful: 

Theorem 13.6. Let $S$ be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a
bounded continuous function; let $A = \operatorname{Int} S$. If $f$ is integrable over $S$,
then $f$ is integrable over $A$, and $\int_S f = \int_A f$.

Proof. Step 1. We show that if $f_S$ is continuous at $x_0$, then $f_A$ is
continuous and agrees with $f_S$ at $x_0$. The proof is easy. If $x_0 \in \operatorname{Int} S$ or
$x_0 \in \operatorname{Ext} S$, then $f_S$ and $f_A$ agree in a neighborhood of $x_0$, and the result is
trivial. Let $x_0 \in \operatorname{Bd} S$. Continuity of $f_S$ at $x_0$ implies that $f_S(x) \to f_S(x_0)$
as $x \to x_0$. Arbitrarily near $x_0$ are points $x$ not in $S$, for which $f_S(x) = 0$;
hence this limit must be 0. Thus $f_S(x_0) = 0$. Since $f_A(x)$ equals either $f_S(x)$
or 0, we have $f_A(x) \to 0$ also as $x \to x_0$. Furthermore, $f_A(x_0) = 0$ because
$x_0 \notin A$. Thus $f_A$ is continuous at $x_0$ and agrees with $f_S$ at $x_0$.

Step 2. We prove the theorem. If $f$ is integrable over $S$, then $f_S$ is
continuous except on a set $D$ of measure zero. Then $f_A$ is continuous at
points not in $D$, so $f$ is integrable over $A$. Since $f_S - f_A$ vanishes at points
not in $D$, we have $\int_Q (f_S - f_A) = 0$, where $Q$ is a rectangle containing $S$.
Then $\int_S f = \int_A f$. □


EXERCISES 


1. Let $f, g : S \to \mathbf{R}$; assume $f$ and $g$ are integrable over $S$.

(a) Show that if $f$ and $g$ agree except on a set of measure zero, then
$\int_S f = \int_S g$.

(b) Show that if $f(x) \le g(x)$ for $x \in S$ and $\int_S f = \int_S g$, then $f$ and $g$
agree except on a set of measure zero.

2. Let $A$ be a rectangle in $\mathbf{R}^k$; let $B$ be a rectangle in $\mathbf{R}^n$; let $Q = A \times B$.
Let $f : Q \to \mathbf{R}$ be a bounded function. Show that if $\int_Q f$ exists, then

$$\int_{y \in B} f(x,y)$$

exists for $x \in A - D$, where $D$ is a set of measure zero in $\mathbf{R}^k$.

3. Complete the proof of Corollary 13.4.

4. Let $S_1$ and $S_2$ be bounded sets in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a bounded
function. Show that if $f$ is integrable over $S_1$ and $S_2$, then $f$ is integrable
over $S_1 - S_2$, and

$$\int_{S_1 - S_2} f = \int_{S_1} f - \int_{S_1 \cap S_2} f.$$

5. Let $S$ be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a bounded continuous
function; let $A = \operatorname{Int} S$. Give an example where $\int_A f$ exists and $\int_S f$
does not.

6. Show that Theorem 13.6 holds without the hypothesis that $f$ is continuous
on $S$.

*7. Prove the following:

Theorem. Let $S$ be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a bounded
function. Let $D$ be the set of points of $S$ at which $f$ fails to be
continuous. Let $E$ be the set of points of Bd $S$ at which the condition

$$\lim_{x \to x_0} f(x) = 0$$

fails to hold. Then $\int_S f$ exists if and only if $D$ and $E$ have measure
zero.

Proof. (a) Show that $f_S$ is continuous at each point $x_0 \notin D \cup E$.

(b) Let $B$ be the set of isolated points of $S$; then $B \subset E$ because the
limit cannot be defined if $x_0$ is not a limit point of $S$. Show that if
$f_S$ is continuous at $x_0$, then $x_0 \notin D \cup (E - B)$.

(c) Show that $B$ is countable.

(d) Complete the proof.


§14. RECTIFIABLE SETS


We now extend the volume function, defined for rectangles, to more general
subsets of $\mathbf{R}^n$. Then we relate this notion to integration theory, and extend
the Fubini theorem to certain integrals of the form $\int_S f$.

Definition. Let $S$ be a bounded set in $\mathbf{R}^n$. If the constant function
1 is integrable over $S$, we say that $S$ is rectifiable, and we define the
($n$-dimensional) volume of $S$ by the equation

$$v(S) = \int_S 1.$$

Note that this definition agrees with our previous definition of volume when 
S is a rectangle. 

Theorem 14.1. A subset $S$ of $\mathbf{R}^n$ is rectifiable if and only if $S$ is
bounded and Bd $S$ has measure zero.

Proof. The function $1_S$ that equals 1 on $S$ and 0 outside $S$ is continuous
on the open sets Ext $S$ and Int $S$. It fails to be continuous at each point of
Bd $S$. By Theorem 11.2, the function $1_S$ is integrable over a rectangle $Q$
containing $S$ if and only if Bd $S$ has measure zero. □


We list some properties of rectifiable sets. 

Theorem 14.2. (a) (Positivity). If $S$ is rectifiable, $v(S) \ge 0$.

(b) (Monotonicity). If $S_1$ and $S_2$ are rectifiable and if $S_1 \subset S_2$, then
$v(S_1) \le v(S_2)$.

(c) (Additivity). If $S_1$ and $S_2$ are rectifiable, so are $S_1 \cup S_2$ and
$S_1 \cap S_2$, and

$$v(S_1 \cup S_2) = v(S_1) + v(S_2) - v(S_1 \cap S_2).$$

(d) Suppose $S$ is rectifiable. Then $v(S) = 0$ if and only if $S$ has
measure zero.

(e) If $S$ is rectifiable, so is the set $A = \operatorname{Int} S$, and $v(S) = v(A)$.

(f) If $S$ is rectifiable, and if $f : S \to \mathbf{R}$ is a bounded continuous
function, then $f$ is integrable over $S$.

Proof. Parts (a), (b), and (c) follow from Theorem 13.3. Part (d) follows
by applying Theorem 11.3 to the non-negative function $1_S$. Part (e) follows
from Theorem 13.6, and (f) from Theorem 13.5. □

Let us make a remark on terminology. The concept of volume, as we have
defined it, was called classically the theory of content (or Jordan content). 
This terminology distinguishes this concept from a more general one, called 
measure (or Lebesgue measure). This concept is important in the develop- 
ment of an integral called the Lebesgue integral, which is a generalization of 
the Riemann integral. 

Measure is defined for a larger class of sets than content is, but the two 
concepts agree when both are defined. A “set of measure zero” as we have 
defined it is in fact just a set whose Lebesgue measure exists and equals zero. 
Such a set need not of course be rectifiable. 

A set whose Lebesgue measure is defined is usually called measurable. 
But there is no universally accepted corresponding term for a set whose Jordan 
content is defined. Some call such sets “Jordan-measurable”; others refer to 
such sets as “domains of integration,” because bounded continuous functions 
are integrable over such sets. One student suggested to me that a set whose 
Jordan content is defined should be called “contented”! I have taken the 
term rectifiable , which is commonly used to refer to a curve whose length is 
defined, and have adopted it to refer to any set having volume (content). 

The class of rectifiable sets in $\mathbf{R}^n$ is not easy to describe other than by
the condition stated in Theorem 14.1. It is tempting to think, for instance,
that any bounded open set in $\mathbf{R}^n$, or any bounded closed set in $\mathbf{R}^n$, should be
rectifiable. That is not the case, as the following example shows:

EXAMPLE 1. We construct a bounded open set $A$ in $\mathbf{R}$ such that Bd $A$ does
not have measure zero.

The rational numbers in the open interval $(0,1)$ are countable; let us
arrange them in a sequence $q_1, q_2, \dots$. Let $0 < a < 1$ be fixed. For each $i$,
choose an open interval $(a_i, b_i)$ of length less than $a/2^i$ that contains $q_i$ and
is contained in $(0,1)$. These intervals will overlap, of course, but that doesn't
matter. Let $A$ be the following open set of $\mathbf{R}$:

$$A = (a_1, b_1) \cup (a_2, b_2) \cup \cdots.$$

We assume Bd $A$ has measure zero and derive a contradiction. Set $\epsilon =
1 - a$. Since Bd $A$ has measure zero, we may cover Bd $A$ by countably many
open intervals of total length less than $\epsilon$. Because $A$ is a subset of $[0,1]$ that
contains each rational in $(0,1)$, we have $\bar{A} = [0,1]$. Since $\bar{A} = A \cup \operatorname{Bd} A$,
the open intervals covering Bd $A$, along with the open intervals $(a_i, b_i)$ whose
union is $A$, give an open covering of the interval $[0,1]$. The total length of
the intervals covering Bd $A$ is less than $\epsilon$, and the total length of the intervals
covering $A$ is less than $\sum a/2^i = a$. Because $[0,1]$ is compact, it can be
covered by finitely many of these intervals; the total length of these intervals
is less than $\epsilon + a = 1$. This contradicts Corollary 10.5.
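
The key estimate in Example 1, that the intervals $(a_i, b_i)$ have total length less than $a$ even though their union is dense in $[0,1]$, can be checked for a finite initial segment of the sequence. The Python sketch below is an illustration under assumptions of our own: the enumeration of the rationals by increasing denominator, the specific interval length $a/2^{i+1}$, and all function names are choices made for the example, and only finitely many intervals are examined. Exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """First `count` rationals p/q in (0,1), in lowest terms, by increasing q."""
    out, q = [], 2
    while len(out) < count:
        for p in range(1, q):
            if gcd(p, q) == 1 and len(out) < count:
                out.append(Fraction(p, q))
        q += 1
    return out

def covering_intervals(a, count):
    """Intervals around q_i of length at most a/2^(i+1) < a/2^i, clipped to (0,1)."""
    intervals = []
    for i, q in enumerate(rationals_in_unit_interval(count), start=1):
        half = Fraction(a) / 2 ** (i + 2)
        intervals.append((max(q - half, Fraction(0)), min(q + half, Fraction(1))))
    return intervals

def union_length(intervals):
    """Total length of the union of the given intervals (overlaps counted once)."""
    total, cur_lo, cur_hi = Fraction(0), None, None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

a = Fraction(1, 2)
length = union_length(covering_intervals(a, 60))
print(float(length), float(a))
```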

We conclude this section by discussing certain rectifiable sets that are
especially useful; they are called the "simple regions." For these sets, a version
of the Fubini theorem holds, as we shall see. We shall use these results only
in the examples and the exercises.

Definition. Let $C$ be a compact rectifiable set in $\mathbf{R}^{n-1}$; let $\phi, \psi : C \to
\mathbf{R}$ be continuous functions such that $\phi(x) \le \psi(x)$ for $x \in C$. The subset $S$
of $\mathbf{R}^n$ defined by the equation

$$S = \{(x,t) \mid x \in C \text{ and } \phi(x) \le t \le \psi(x)\}$$

is called a simple region in $\mathbf{R}^n$.

There is nothing special about the last coordinate here. If $k + l = n - 1$,
and if $y$ and $z$ denote the general points of $\mathbf{R}^k$ and $\mathbf{R}^l$, respectively, then the
set

$$S' = \{(y,t,z) \mid (y,z) \in C \text{ and } \phi(y,z) \le t \le \psi(y,z)\}$$

is also called a simple region in $\mathbf{R}^n$.

*Lemma 14.3. If $S$ is a simple region in $\mathbf{R}^n$, then $S$ is compact
and rectifiable.

Proof. Let $S$ be a simple region, as in the definition. We show that $S$
is compact and that Bd $S$ has measure zero.

Step 1. The graph of $\phi$ is the subset of $\mathbf{R}^n$ defined by the equation

$$G_\phi = \{(x,t) \mid x \in C \text{ and } t = \phi(x)\}.$$

We show that Bd $S$ lies in the union of the three sets $G_\phi$ and $G_\psi$ and

$$D = \{(x,t) \mid x \in \operatorname{Bd} C \text{ and } \phi(x) \le t \le \psi(x)\}.$$

Since each of these sets is contained in $S$, it follows that Bd $S \subset S$, so that $S$
is closed. Being bounded, $S$ is thus compact. See Figure 14.1.

Suppose that $(x_0, t_0)$ belongs to none of the sets $G_\phi$, $G_\psi$, or $D$. We show
that $(x_0, t_0)$ lies either in Int $S$ or Ext $S$. As you can check, there are three
possibilities:

(1) $x_0 \notin C$,

(2) $x_0 \in C$ and either $t_0 < \phi(x_0)$ or $t_0 > \psi(x_0)$,

(3) $x_0 \in \operatorname{Int} C$ and $\phi(x_0) < t_0 < \psi(x_0)$.

In case (1), there is a neighborhood $U$ of $x_0$ disjoint from $C$. Then $U \times \mathbf{R}$ is
disjoint from $S$, so that $(x_0, t_0) \in \operatorname{Ext} S$.

Consider case (2). Suppose that $t_0 < \phi(x_0)$. By continuity of $\phi$, we can
choose a neighborhood $W$ of $(x_0, t_0)$ such that the function $\phi(x) - t$ is positive
for $x \in C$ and $(x,t) \in W$. Then $W$ is disjoint from $S$, so that $(x_0, t_0) \in \operatorname{Ext} S$.
A similar argument applies if $t_0 > \psi(x_0)$.

Consider case (3). By continuity, there is a neighborhood $U \times V$ of $(x_0, t_0)$
in $\mathbf{R}^n$ such that $U \subset C$ and both functions $t - \phi(x)$ and $\psi(x) - t$ are positive
on $U \times V$. Then $U \times V$ is contained in $S$, so that $(x_0, t_0) \in \operatorname{Int} S$.

Step 2. We show that $G_\phi$ and $G_\psi$ have measure zero.

It suffices to consider the case of $G_\phi$. Choose a rectangle $Q$ in $\mathbf{R}^{n-1}$
containing the set $C$. Given $\epsilon > 0$, let $\epsilon'$ be the number $\epsilon' = \epsilon/2v(Q)$.
Because $\phi$ is continuous and $C$ is compact, there is, by the theorem on uniform
continuity, a $\delta > 0$ such that $|\phi(x) - \phi(y)| < \epsilon'$ whenever $x, y \in C$ and
$|x - y| < \delta$. Choose a partition $P$ of $Q$ of mesh less than $\delta$. If $R$ is a
subrectangle determined by $P$, and if $R$ intersects $C$, then $|\phi(x) - \phi(y)| < \epsilon'$
for $x, y \in R \cap C$. For each such $R$, choose a point $x_R$ of $R \cap C$ and define
$I_R$ to be the interval

$$I_R = [\phi(x_R) - \epsilon',\ \phi(x_R) + \epsilon'].$$

Then the $n$-dimensional rectangle $R \times I_R$ contains every point of the form
$(x, \phi(x))$ for which $x \in C \cap R$. See Figure 14.2.

The rectangles $R \times I_R$, as $R$ ranges over all subrectangles that intersect
$C$, thus cover $G_\phi$. Their total volume is

$$\sum_R v(R \times I_R) = \sum_R v(R)(2\epsilon') \le 2\epsilon'\, v(Q) = \epsilon.$$

Figure 14.2


Step 3. We show the set D has measure zero; then the proof is complete.
Because $\phi$ and $\psi$ are continuous and C is compact, there is a number M such
that

\[ -M \le \phi(x) \le \psi(x) \le M \]

for $x \in C$. Given $\epsilon > 0$, cover Bd C by rectangles $Q_1, Q_2, \ldots$ in $\mathbf{R}^{n-1}$ of total
volume less than $\epsilon/2M$. Then the rectangles $Q_i \times [-M, M]$ in $\mathbf{R}^n$ cover D
and have total volume less than $\epsilon$. □

*Theorem 14.4 (Fubini's theorem for simple regions). Let

\[ S = \{(x,t) \mid x \in C \text{ and } \phi(x) \le t \le \psi(x)\} \]

be a simple region in $\mathbf{R}^n$. Let $f : S \to \mathbf{R}$ be a continuous function. Then
f is integrable over S, and

\[ \int_S f = \int_{x \in C} \int_{t=\phi(x)}^{t=\psi(x)} f(x,t). \]


Proof. Let $Q \times [-M, M]$ be a rectangle in $\mathbf{R}^n$ containing S. Because f
is continuous and bounded on S and S is rectifiable, f is integrable over S.
Furthermore, for fixed $x_0 \in Q$, the function $f_S(x_0, t)$ is either identically zero
(if $x_0 \notin C$), or it is continuous at all but two points of $\mathbf{R}$. We conclude from
Fubini's theorem that

\[ \int_S f = \int_{x \in Q} \int_{t=-M}^{t=M} f_S(x,t). \]

Since the inner integral vanishes if $x \notin C$, we can write this equation as

\[ \int_S f = \int_{x \in C} \int_{t=-M}^{t=M} f_S(x,t). \]

Furthermore, the number $f_S(x,t)$ vanishes unless $\phi(x) \le t \le \psi(x)$, in which
case it equals $f(x,t)$. Therefore we can write

\[ \int_S f = \int_{x \in C} \int_{t=\phi(x)}^{t=\psi(x)} f(x,t). \qquad \square \]


The preceding theorem gives us a reasonable method for reducing the
n-dimensional integral $\int_S f$ to lower-dimensional integrals, at least if the
integrand is continuous and the set S is a simple region.
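
The reduction to an iterated integral can be tried numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `midpoint` and `integrate_simple_region`, the use of the midpoint rule, and the sample region $C = [0,1]$, $\phi(x) = x^2$, $\psi(x) = x$ with integrand $f(x,t) = xt$ are all choices made for this example and do not come from the text. For this choice the iterated integral has the exact value $1/24$.

```python
def midpoint(g, a, b, steps):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def integrate_simple_region(f, c_lo, c_hi, phi, psi, steps=400):
    """Approximate the integral of f over the simple region
    {(x, t) : c_lo <= x <= c_hi and phi(x) <= t <= psi(x)} in R^2
    as an iterated integral: inner integral in t, outer integral in x."""
    def inner(x):
        # Inner integral over t from phi(x) to psi(x), with x held fixed.
        return midpoint(lambda t: f(x, t), phi(x), psi(x), steps)

    return midpoint(inner, c_lo, c_hi, steps)


# Hypothetical sample data: the region between the parabola t = x^2
# and the line t = x over 0 <= x <= 1, with integrand f(x, t) = x * t.
approx = integrate_simple_region(
    lambda x, t: x * t, 0.0, 1.0, lambda x: x * x, lambda x: x
)
print(approx, 1.0 / 24.0)  # numerical value next to the exact value 1/24
```

The same helper applies to each piece when a set is cut into several simple regions; one would add the results, as described in the text.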

If the set S is not a simple region, one can often in practice express S
as a union of simple regions that overlap in sets of measure zero. Additivity
of the integral tells us that we can evaluate the integral $\int_S f$ by integrating
over each of these regions separately and adding the results together. Just
as in calculus, the procedure can be reasonably laborious. But at least it is
straightforward.

Of course, there are rectifiable sets that cannot be broken up in this way
into simple regions. Computing integrals over such sets is more difficult. One
way of proceeding is to approximate S by a union of simple regions and follow
a limiting procedure.

EXAMPLE 2. Suppose one wishes to integrate a continuous function f over
the set S in $\mathbf{R}^2$ pictured in Figure 14.3. While S is not a simple region, it is
easy to break S up into simple regions that overlap in sets of measure zero, 
as indicated by the dotted lines. 


EXAMPLE 3. Consider the set S in $\mathbf{R}^2$ given by

\[ S = \{(x,y) \mid 1 \le x^2 + y^2 \le 4\}; \]

it is pictured in Figure 14.4. While S is not a simple region, one can evaluate 
an integral over S by breaking S up into two simple regions that overlap in 
a set of measure zero, as indicated, and integrating over each of these regions 
separately. The limits of integration will be rather unpleasant, of course. 

Now if one were actually assigned a problem like this in a calculus course, 
one would do no such thing! What one would do instead would be to express 
the integral in terms of polar coordinates, thereby obtaining an integral with 
much simpler limits of integration. 

Expressing a two-dimensional integral in terms of polar coordinates is a 
special case of a quite general method for evaluating integrals, which is called 
“substitution” or “change of variables.” We shall deal with it in the next 
chapter. 



Figure 14-4 

Let us make one final remark. There is one thing lacking in our discussion
of the notion of volume. How do we know that the volume of a set is
independent of its position in space? Said differently, if S is a rectifiable set,
and if $h : \mathbf{R}^n \to \mathbf{R}^n$ is a rigid motion (whatever that means), how do we know
that the sets S and h(S) have the same volume?

For example, each of the sets S and T pictured in Figure 14.5 represents 
a square with edge length 5; in fact T is obtained by rotating S through 
the angle $\theta = \arctan 3/4$. It is immediate from the definition that S has
volume 25. It is clear that T is rectifiable, for it is a simple region. But how 
do we know T has volume 25? 



One can of course simply calculate v(T). One way to proceed is to write
equations for the functions $\psi(x)$ and $\phi(x)$ whose graphs bound T above and
below respectively, and to integrate the function $\psi(x) - \phi(x)$ over the interval
$[-3, 4]$. See Figure 14.6.
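
This first procedure can be sketched in a few lines of Python. The sketch assumes, consistently with the interval $[-3, 4]$ but not stated explicitly in the text, that S is the square $[0,5]^2$ and that the rotation is about the origin, so that T has vertices $(0,0)$, $(4,3)$, $(1,7)$, and $(-3,4)$. The function names `lower`, `upper`, and `area_of_T` are hypothetical.

```python
def lower(x):
    """phi: lower boundary of T, along the edges (-3,4)-(0,0) and (0,0)-(4,3)."""
    return -4.0 * x / 3.0 if x <= 0.0 else 3.0 * x / 4.0


def upper(x):
    """psi: upper boundary of T, along the edges (-3,4)-(1,7) and (1,7)-(4,3)."""
    if x <= 1.0:
        return 4.0 + 3.0 * (x + 3.0) / 4.0
    return 7.0 - 4.0 * (x - 1.0) / 3.0


def area_of_T(steps=7000):
    """Trapezoid-rule approximation of the integral of psi - phi over [-3, 4]."""
    a, b = -3.0, 4.0
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x0 = a + k * h
        x1 = a + (k + 1) * h
        total += 0.5 * h * ((upper(x0) - lower(x0)) + (upper(x1) - lower(x1)))
    return total


print(area_of_T())  # compare with the expected volume 25
```

Because $\psi - \phi$ is piecewise linear, the trapezoid rule is essentially exact here; the computation confirms the value 25 for this one square but of course proves nothing in general.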

Figure 14.6

Another way to proceed is to enclose T in a rectangle Q, take a partition P
of Q, and calculate the upper and lower sums of the function $1_T$ with respect
to P. The lower sum equals the total area of all subrectangles contained in T,
while the upper sum equals the total area of all subrectangles that intersect T.
One needs to show that

\[ L(1_T, P) \le 25 \le U(1_T, P) \]

for all P. See Figure 14.7.

Neither of these procedures is especially appealing! What one needs is a
general theorem. In the next chapter, we shall prove the following result:
Suppose $h : \mathbf{R}^n \to \mathbf{R}^n$ is a function satisfying the condition

\[ \| h(x) - h(y) \| = \| x - y \| \]

for all $x, y \in \mathbf{R}^n$; such a function is called an isometry. If S is a rectifiable
set in $\mathbf{R}^n$, then the set T = h(S) is also rectifiable, and v(T) = v(S).



Figure 14-7 

EXERCISES 

1. Let S be a bounded set in $\mathbf{R}^n$ that is the union of the countable collection
of rectifiable sets $S_1, S_2, \ldots$.

(a) Show that $S_1 \cup \cdots \cup S_n$ is rectifiable.

(b) Give an example showing that S need not be rectifiable.

2. Show that if $S_1$ and $S_2$ are rectifiable, so is $S_1 - S_2$, and

\[ v(S_1 - S_2) = v(S_1) - v(S_1 \cap S_2). \]

3. Show that if A is a nonempty, rectifiable open set in $\mathbf{R}^n$, then v(A) > 0.

4. Give an example of a bounded set of measure zero that is rectifiable, and
an example of a bounded set of measure zero that is not rectifiable.

5. Find a bounded closed set in $\mathbf{R}$ that is not rectifiable.

6. Let A be a bounded open set in $\mathbf{R}^n$; let $f : \mathbf{R}^n \to \mathbf{R}$ be a bounded
continuous function. Give an example where $\int_{\bar{A}} f$ exists but $\int_A f$ does
not.

7. Let S be a bounded set in $\mathbf{R}^n$.

(a) Show that if S is rectifiable, then so is the set $\bar{S}$, and $v(\bar{S}) = v(S)$.

(b) Give an example where $\bar{S}$ and Int S are rectifiable, but S is not.

8. Let A and B be rectangles in $\mathbf{R}^k$ and $\mathbf{R}^n$, respectively. Let S be a set
contained in $A \times B$. For each $y \in B$, let

\[ S_y = \{x \mid x \in A \text{ and } (x,y) \in S\}. \]

We call $S_y$ a cross-section of S. Show that if S is rectifiable, and if $S_y$
is rectifiable for each $y \in B$, then

\[ v(S) = \int_{y \in B} v(S_y). \]



§15. IMPROPER INTEGRALS 


We now extend our notion of the integral. We define the integral $\int_S f$ in the
case where S is not necessarily bounded and f is not necessarily bounded.
Such an integral is sometimes called an improper integral.

We shall define our extended notion of the integral only in the case
where S is open in $\mathbf{R}^n$.

Definition. Let A be an open set in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be a continuous
function. If f is non-negative on A, we define the (extended) integral of f
over A, denoted $\int_A f$, to be the supremum of the numbers $\int_D f$, as D ranges
over all compact rectifiable subsets of A, provided this supremum exists. In
this case, we say that f is integrable over A (in the extended sense). More
generally, if f is an arbitrary continuous function on A, set

\[ f_+(x) = \max\{f(x), 0\} \quad \text{and} \quad f_-(x) = \max\{-f(x), 0\}. \]

We say that f is integrable over A (in the extended sense) if both $f_+$ and
$f_-$ are; and in this case we set

\[ \int_A f = \int_A f_+ - \int_A f_-, \]

where $\int_A$ denotes the extended integral throughout.

If A is open in $\mathbf{R}^n$ and both f and A are bounded, we now have two
different meanings for the symbol $\int_A f$. It could mean the extended integral,
or it could mean the ordinary integral. It turns out that if the ordinary
integral exists, then so does the extended integral and the two integrals are
equal. Nevertheless, some ambiguity persists, because the extended integral
may exist when the ordinary integral does not. To avoid ambiguity, we make
the following convention:

Convention. If A is an open set in $\mathbf{R}^n$, then $\int_A f$ will denote the
extended integral unless specifically stated otherwise.

Of course, if A is not open, there is no ambiguity; $\int_A f$ must denote the
ordinary integral in this case.

We now give a reformulation of the definition of the extended integral that 
is convenient for many purposes. It is related to the way improper integrals 
are defined in calculus. We begin with a preliminary lemma: 

Lemma 15.1. Let A be an open set in $\mathbf{R}^n$. Then there exists a
sequence $C_1, C_2, \ldots$ of compact rectifiable subsets of A whose union is
A, such that $C_N \subset \operatorname{Int} C_{N+1}$ for each N.

Proof. Let d denote the sup metric $d(x,y) = |x - y|$ on $\mathbf{R}^n$. If $B \subset \mathbf{R}^n$,
let $d(x, B)$ denote the distance from x to B, as usual. (See §4.)

Now set $B = \mathbf{R}^n - A$. Then given a positive integer N, let $D_N$ denote
the set

\[ D_N = \{x \mid d(x,B) \ge 1/N \text{ and } d(x,0) \le N\}. \]

Since $d(x,B)$ and $d(x,0)$ are continuous functions of x (see the proof of
Theorem 4.6), $D_N$ is a closed subset of $\mathbf{R}^n$. Because $D_N$ is contained in the cube
of radius N centered at 0, it is bounded and thus compact. Also, $D_N$ is
contained in A, since the inequality $d(x,B) \ge 1/N$ implies that x cannot be
in B. To show the sets $D_N$ cover A, let x be a point of A. Since A is open,
$d(x,B) > 0$; then there is an N such that $d(x,B) \ge 1/N$ and $d(x,0) \le N$,
so that $x \in D_N$. Finally, we note that the set

\[ A_{N+1} = \{x \mid d(x,B) > 1/(N+1) \text{ and } d(x,0) < N+1\} \]

is open (because $d(x,B)$ and $d(x,0)$ are continuous). Since $A_{N+1}$ is contained
in $D_{N+1}$ and contains $D_N$ by definition, it follows that $D_N \subset \operatorname{Int} D_{N+1}$.
See Figure 15.1.

Figure 15.1

The sets $D_N$ are not quite the sets we want, since they may not be
rectifiable. We construct the sets $C_N$ as follows: For each $x \in D_N$, choose a
closed cube that is centered at x and is contained in Int $D_{N+1}$. The interiors
of these cubes cover $D_N$; choose finitely many of them whose interiors cover
$D_N$ and let their union be $C_N$. Since $C_N$ is a finite union of rectangles, it is
compact and rectifiable. Then

\[ D_N \subset \operatorname{Int} C_N \subset C_N \subset \operatorname{Int} D_{N+1}. \]

It follows that the union of the sets $C_N$ equals A and that $C_N \subset \operatorname{Int} C_{N+1}$
for each N. □


Now we obtain our alternate formulation of the definition: 

Theorem 15.2. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be continuous.
Choose a sequence $C_N$ of compact rectifiable subsets of A whose union
is A such that $C_N \subset \operatorname{Int} C_{N+1}$ for each N. Then f is integrable over A
if and only if the sequence $\int_{C_N} |f|$ is bounded. In this case,

\[ \int_A f = \lim_{N \to \infty} \int_{C_N} f. \]

It follows from this theorem that f is integrable over A if and only if $|f|$
is integrable over A.

Proof. Step 1. We prove the theorem first in the case where f is
non-negative. Here $f = |f|$. Since the sequence $\int_{C_N} f$ is increasing (by
monotonicity), it converges if and only if it is bounded.

Suppose first that f is integrable over A. If we let D range over all
compact rectifiable subsets of A, then

\[ \int_{C_N} f \le \sup_D \Big\{ \int_D f \Big\} = \int_A f, \]

since $C_N$ is itself a compact rectifiable subset of A. It follows that the sequence
$\int_{C_N} f$ is bounded, and

\[ \lim_{N \to \infty} \int_{C_N} f \le \int_A f. \]

Conversely, suppose the sequence $\int_{C_N} f$ is bounded. Let D be an
arbitrary compact rectifiable subset of A. Then D is covered by the open sets

\[ \operatorname{Int} C_1 \subset \operatorname{Int} C_2 \subset \cdots, \]

hence by finitely many of them, and hence by one of them, say Int $C_M$. Then

\[ \int_D f \le \int_{C_M} f \le \lim_{N \to \infty} \int_{C_N} f. \]

Since D is arbitrary, it follows that f is integrable over A, and

\[ \int_A f \le \lim_{N \to \infty} \int_{C_N} f. \]

Step 2. Now let $f : A \to \mathbf{R}$ be an arbitrary continuous function. By
definition, f is integrable over A if and only if $f_+$ and $f_-$ are integrable over A;
this occurs if and only if the sequences $\int_{C_N} f_+$ and $\int_{C_N} f_-$ are bounded, by
Step 1. Note that

\[ 0 \le f_+(x) \le |f(x)| \quad \text{and} \quad 0 \le f_-(x) \le |f(x)|, \]

while

\[ |f(x)| = f_+(x) + f_-(x). \]

It follows that the sequences $\int_{C_N} f_+$ and $\int_{C_N} f_-$ are bounded if and only if the
sequence $\int_{C_N} |f|$ is bounded. In this case, the former two sequences converge
to $\int_A f_+$ and $\int_A f_-$, respectively. Since convergent sequences can be added
term-by-term, the sequence

\[ \int_{C_N} f = \int_{C_N} f_+ - \int_{C_N} f_- \]

converges to $\int_A f_+ - \int_A f_-$; and the latter equals $\int_A f$ by definition. □


We now verify the properties of the extended integral; many are analogous
to those of the ordinary integral. Then we relate the extended integral to the
ordinary integral in the case where both are defined.

Theorem 15.3. Let A be an open set in $\mathbf{R}^n$. Let $f, g : A \to \mathbf{R}$ be
continuous functions.

(a) (Linearity). If f and g are integrable over A, so is af + bg; and

\[ \int_A (af + bg) = a \int_A f + b \int_A g. \]

(b) (Comparison). Let f and g be integrable over A. If $f(x) \le g(x)$
for $x \in A$, then

\[ \int_A f \le \int_A g. \]

In particular,

\[ \Big| \int_A f \Big| \le \int_A |f|. \]

(c) (Monotonicity). Assume B is open and $B \subset A$. If f is
non-negative on A and integrable over A, then f is integrable over B and

\[ \int_B f \le \int_A f. \]

(d) (Additivity). Suppose A and B are open in $\mathbf{R}^n$ and f is
continuous on $A \cup B$. If f is integrable on A and B, then f is integrable on
$A \cup B$ and $A \cap B$, and

\[ \int_{A \cup B} f = \int_A f + \int_B f - \int_{A \cap B} f. \]



Note that by our convention, the integral symbol denotes the extended 
integral throughout the statement of this theorem. 


Proof. Let $C_N$ be a sequence of compact rectifiable sets whose union is
A, such that $C_N \subset \operatorname{Int} C_{N+1}$ for all N.

(a) We have

\[ \int_{C_N} |af + bg| \le |a| \int_{C_N} |f| + |b| \int_{C_N} |g|, \]

by the comparison and linearity properties of the ordinary integral. Since both
sequences $\int_{C_N} |f|$ and $\int_{C_N} |g|$ are bounded, so is $\int_{C_N} |af + bg|$. Linearity now
follows by taking limits in the equation

\[ \int_{C_N} (af + bg) = a \int_{C_N} f + b \int_{C_N} g. \]

(b) If $f(x) \le g(x)$, one takes limits in the inequality

\[ \int_{C_N} f \le \int_{C_N} g. \]

(c) If D is a compact rectifiable subset of B, then D is also a compact
rectifiable subset of A, so that

\[ \int_D f \le \int_A f \]

by definition. Since D is arbitrary, f is integrable over B and $\int_B f \le \int_A f$.

(d) Let $D_N$ be a sequence of compact rectifiable sets whose union is B
such that $D_N \subset \operatorname{Int} D_{N+1}$ for each N. Let

\[ E_N = C_N \cup D_N \quad \text{and} \quad F_N = C_N \cap D_N. \]

Then $E_N$ and $F_N$ are sequences of compact rectifiable sets whose unions equal
$A \cup B$ and $A \cap B$, respectively. See Figure 15.2.

Figure 15.2

We show $E_N \subset \operatorname{Int} E_{N+1}$ and $F_N \subset \operatorname{Int} F_{N+1}$. If $x \in E_N$, then x is in
either $C_N$ or $D_N$. If the former, then some neighborhood of x is contained in
$C_{N+1}$. If the latter, some neighborhood of x is contained in $D_{N+1}$. In either
case, this neighborhood of x is contained in $E_{N+1}$, so that $x \in \operatorname{Int} E_{N+1}$.
Similarly, if $x \in F_N$, then some neighborhood U of x is contained in
$C_{N+1}$, and some neighborhood V of x is contained in $D_{N+1}$. The
neighborhood $U \cap V$ of x is thus contained in $F_{N+1}$, so that $x \in \operatorname{Int} F_{N+1}$.

Additivity of the ordinary integral tells us that

\[ (*) \qquad \int_{E_N} f = \int_{C_N} f + \int_{D_N} f - \int_{F_N} f. \]

Applying this equation to the function $|f|$, we see that $\int_{E_N} |f|$ and $\int_{F_N} |f|$
are bounded above by

\[ \int_{C_N} |f| + \int_{D_N} |f| \le \int_A |f| + \int_B |f|. \]

Thus f is integrable over $A \cup B$ and $A \cap B$. The desired equation now follows
by taking limits in (*). □


Now we relate the extended integral to the ordinary integral. 

Theorem 15.4. Let A be a bounded open set in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$
be a bounded continuous function. Then the extended integral $\int_A f$
exists. If the ordinary integral $\int_A f$ also exists, then these two integrals
are equal.


Proof. Let Q be a rectangle containing A.

Step 1. We show the extended integral of f exists. Choose M so that
$|f(x)| \le M$ for $x \in A$. Then for any compact rectifiable subset D of A,

\[ \int_D |f| \le \int_D M \le M \cdot v(Q). \]

Thus f is integrable over A in the extended sense.

Step 2. We consider the case where f is non-negative. Suppose the
ordinary integral of f over A exists. It equals, by definition, the integral
over Q of the function $f_A$. If D is a compact rectifiable subset of A, then

\[
\begin{aligned}
\int_D f &= \int_D f_A && \text{because } f = f_A \text{ on } D, \\
&\le \int_Q f_A && \text{by monotonicity,} \\
&= (\text{ordinary}) \int_A f.
\end{aligned}
\]

Since D is arbitrary, it follows that

\[ (\text{extended}) \int_A f \le (\text{ordinary}) \int_A f. \]

Figure 15.3

On the other hand, let P be a partition of Q, and let R denote the general
subrectangle determined by P. Denote by $R_1, \ldots, R_k$ those subrectangles
that lie in A, and let $D = R_1 \cup \cdots \cup R_k$. See Figure 15.3. Now

\[ L(f_A, P) = \sum_{i=1}^{k} m_{R_i}(f)\, v(R_i), \]

because $m_R(f_A) = m_R(f)$ if R is contained in A and $m_R(f_A) = 0$ if R is not
contained in A. On the other hand,

\[
\begin{aligned}
\sum_{i=1}^{k} m_{R_i}(f)\, v(R_i) &\le \sum_{i=1}^{k} \int_{R_i} f && \text{by the comparison property,} \\
&= \int_D f && \text{by additivity,} \\
&\le (\text{extended}) \int_A f && \text{by definition.}
\end{aligned}
\]

Since P is arbitrary, we conclude that

\[ (\text{ordinary}) \int_A f \le (\text{extended}) \int_A f. \]


Step 3. Now we consider the general case. Write $f = f_+ - f_-$, as usual.
Since f is integrable over A in the ordinary sense, so are $f_+$ and $f_-$, by
Lemma 13.2. Then

\[
\begin{aligned}
(\text{ordinary}) \int_A f &= (\text{ordinary}) \int_A f_+ - (\text{ordinary}) \int_A f_- && \text{by linearity,} \\
&= (\text{extended}) \int_A f_+ - (\text{extended}) \int_A f_- && \text{by Step 2,} \\
&= (\text{extended}) \int_A f && \text{by definition.} \qquad \square
\end{aligned}
\]


EXAMPLE 1. If A is a bounded open set in $\mathbf{R}^n$ and $f : A \to \mathbf{R}$ is a bounded
continuous function, then the extended integral $\int_A f$ exists, but the ordinary
integral $\int_A f$ may not. For example, let A be the open subset of $\mathbf{R}$ constructed
in Example 1 of §14. The set A is bounded, but Bd A does not have measure
zero. Then the ordinary integral $\int_A 1$ does not exist, although the extended
integral $\int_A 1$ does.

A consequence of the preceding theorem is the following: 

Corollary 15.5. Let S be a bounded set in $\mathbf{R}^n$; let $f : S \to \mathbf{R}$ be a
bounded continuous function. If f is integrable over S in the ordinary
sense, then

\[ (\text{ordinary}) \int_S f = (\text{extended}) \int_{\operatorname{Int} S} f. \]

Proof. One applies Theorems 13.6 and 15.4. □


This corollary tells us that any theorem we prove about extended integrals 
has implications for ordinary integrals. The change of variables theorem, 
which we prove in the next chapter, is an important example. 

We have already given two formulations of the definition of the extended
integral, and we will give another in the next chapter. All these versions of
the definition are useful for different theoretical purposes. Actually applying
them to computational problems can be a bit awkward, however. Here is a
formulation that is useful in many practical situations. We shall use it in some
of the examples and exercises:

*Theorem 15.6. Let A be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be continuous.
Let $U_1 \subset U_2 \subset \cdots$ be a sequence of open sets whose union is A. Then
$\int_A f$ exists if and only if the sequence $\int_{U_N} |f|$ exists and is bounded; in
this case,

\[ \int_A f = \lim_{N \to \infty} \int_{U_N} f. \]

Proof. It suffices, as usual, to consider the case where f is non-negative.
Suppose the integral $\int_A f$ exists. Monotonicity of the extended integral
implies that f is integrable over $U_N$ and that for each N,

\[ \int_{U_N} f \le \int_A f. \]

It follows that the increasing sequence $\int_{U_N} f$ converges, and that

\[ \lim_{N \to \infty} \int_{U_N} f \le \int_A f. \]

Conversely, suppose the sequence $\int_{U_N} f$ exists and is bounded. Let D
be a compact rectifiable subset of A. Since D is covered by the open sets
$U_1 \subset U_2 \subset \cdots$, it is covered by finitely many of them, and hence by one of
them, say $U_M$. Then, by definition,

\[ \int_D f \le \int_{U_M} f \le \lim_{N \to \infty} \int_{U_N} f. \]

Since D is arbitrary,

\[ \int_A f \le \lim_{N \to \infty} \int_{U_N} f. \qquad \square \]


In applying this theorem, we usually choose $U_N$ so that it is rectifiable
and f is bounded on $U_N$; then the integral $\int_{U_N} f$ exists as an ordinary
integral (and hence as an extended integral) and can be computed by familiar
techniques. See the examples following.

EXAMPLE 2. Let A be the open set in $\mathbf{R}^2$ defined by the equation

\[ A = \{(x,y) \mid x > 1 \text{ and } y > 1\}. \]

Let $f(x,y) = 1/x^2y^2$. Then f is bounded on A, but A is unbounded. We
could use Theorem 15.2 to calculate $\int_A f$, by setting $C_N = [(N+1)/N,\ N]^2$
and integrating f over $C_N$. It is a bit easier to use Theorem 15.6, setting
$U_N = (1,N)^2$ and integrating f over $U_N$. See Figure 15.4. The set $U_N$ is
rectifiable; f is bounded on $U_N$ because $\bar{U}_N$ is compact and f is continuous
on $\bar{U}_N$. Thus $\int_{U_N} f$ exists as an ordinary integral, so we can apply the Fubini
theorem. We compute

\[ \int_{U_N} f = \int_{x=1}^{x=N} \int_{y=1}^{y=N} \frac{1}{x^2y^2} = \big((N-1)/N\big)^2. \]

We conclude that $\int_A f = 1$.




EXAMPLE 3. Let $B = (0,1)^2$; let $f(x,y) = 1/x^2y^2$, as before. Here B is
bounded but f is not bounded on B; indeed, f is unbounded near each point
of the x and y axes. However, if we set $U_N = (1/N, 1)^2$, then f is bounded
on $U_N$. See Figure 15.5. We compute

\[ \int_{U_N} f = (-1 + N)^2. \]

We conclude that $\int_B f$ does not exist.
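
The two computations in Examples 2 and 3 can be checked numerically. The sketch below is illustrative only: the function names and the use of the midpoint rule are assumptions made here, and the code relies on the fact that the integrand $1/x^2y^2$ is a product, so that by the Fubini theorem the integral over the square $U_N$ is the square of a one-dimensional integral.

```python
def midpoint_integral(g, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def inv_square(x):
    return 1.0 / (x * x)


def integral_example_2(N):
    """Integral of 1/(x^2 y^2) over U_N = (1, N)^2, as a squared 1-D integral."""
    return midpoint_integral(inv_square, 1.0, float(N)) ** 2


def integral_example_3(N):
    """Integral of 1/(x^2 y^2) over U_N = (1/N, 1)^2, as a squared 1-D integral."""
    return midpoint_integral(inv_square, 1.0 / N, 1.0) ** 2


for N in (2, 10, 100):
    # Closed forms from the text: ((N-1)/N)^2 and (N-1)^2.
    print(N, integral_example_2(N), ((N - 1) / N) ** 2)
    print(N, integral_example_3(N), (N - 1) ** 2)
```

The first sequence stays bounded by 1 and increases toward it, while the second grows without bound, which is what Theorem 15.6 requires in order to conclude that the first integral exists and the second does not.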
EXERCISES

1. Let $f : \mathbf{R} \to \mathbf{R}$ be the function f(x) = x. Show that, given $\lambda \in \mathbf{R}$, there
exists a sequence $C_N$ of compact rectifiable subsets of $\mathbf{R}$ whose union is
$\mathbf{R}$, such that $C_N \subset \operatorname{Int} C_{N+1}$ for each N and

\[ \lim_{N \to \infty} \int_{C_N} f = \lambda. \]

Does the extended integral $\int_{\mathbf{R}} f$ exist?

2. Let A be open in $\mathbf{R}^n$; let $f, g : A \to \mathbf{R}$ be continuous; suppose that
$|f(x)| \le g(x)$ for $x \in A$. Show that if $\int_A g$ exists, so does $\int_A f$. (This
result is analogous to the so-called "comparison test" for the convergence
of an infinite series.)

3. (a) Let A and B be the sets of Examples 2 and 3; let $f(x,y) = 1/(xy)^{1/2}$.
Determine whether $\int_A f$ and $\int_B f$ exist; if either does, calculate it.

(b) Let $C = \{(x,y) \mid x > 0 \text{ and } y > 0\}$. Let

\[ f(x,y) = \frac{1}{(x^2 + \sqrt{x})(y^2 + \sqrt{y})}. \]

Show that $\int_C f$ exists; do not attempt to calculate it.

4. Let $f(x,y) = 1/(y+1)^2$. Let A and B be the open sets

\[ A = \{(x,y) \mid x > 0 \text{ and } x < y < 2x\}, \]

\[ B = \{(x,y) \mid x > 0 \text{ and } x^2 < y < 2x^2\}, \]

of $\mathbf{R}^2$. Show that $\int_A f$ does not exist; show that $\int_B f$ does exist and
calculate it. See Figure 15.6.



Figure 15.6

5. Let $f(x,y) = \dfrac{1}{x(xy)^{1/2}}$ for x > 0 and y > 0. Let

\[ A_0 = \{(x,y) \mid 0 < x < 1 \text{ and } x < y < 2x\}, \]

\[ B_0 = \{(x,y) \mid 0 < x < 1 \text{ and } x^2 < y < 2x^2\}. \]

Determine whether $\int_{A_0} f$ and $\int_{B_0} f$ exist; if so, calculate.

6. Let A be the set in $\mathbf{R}^2$ defined by the equation

\[ A = \{(x,y) \mid x > 1 \text{ and } 0 < y < 1/x\}. \]

Calculate $\int_A \dfrac{1}{x y^{1/2}}$ if it exists.

*7. Let A be a bounded open set in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be a bounded
continuous function. Let Q be a rectangle containing A. Show that

\[ \int_A f = \underline{\int_Q} (f_+)_A - \underline{\int_Q} (f_-)_A. \]

*8. Let A be open in $\mathbf{R}^n$. We say $f : A \to \mathbf{R}$ is locally bounded on A if
each x in A has a neighborhood on which f is bounded. Let $\mathcal{F}(A)$ be
the set of all functions $f : A \to \mathbf{R}$ that are locally bounded on A and
continuous on A except on a set of measure zero.

(a) Show that if f is continuous on A, then $f \in \mathcal{F}(A)$.

(b) Show that if f is in $\mathcal{F}(A)$, then f is bounded on each compact subset
of A and the definition of the extended integral $\int_A f$ goes through
without change.

(c) Show that Theorem 15.3 holds for functions f in $\mathcal{F}(A)$.

(d) Show that Theorem 15.4 holds if the word "continuous" in the
hypothesis is replaced by "continuous except on a set of measure zero."





Change of Variables 


In evaluating the integral of a function of a single variable, one of the most
useful tools is the so-called "substitution rule." It is used in calculus, for
example, to evaluate such an integral as

\[ \int_0^1 (2x^2 + 1)^{10} (4x)\, dx; \]

one makes the substitution $y = 2x^2 + 1$, reducing this integral to the integral

\[ \int_1^3 y^{10}\, dy, \]

which is easy to evaluate. (Here we use the "dx" and "dy" notation of
calculus.)
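
As a quick numerical check of this substitution, the following Python sketch compares the two integrals. It assumes that the x-integral is taken over $[0, 1]$, so that $y = 2x^2 + 1$ runs over $[1, 3]$; these limits, the helper names, and the midpoint rule are assumptions of the sketch. Under that assumption both integrals equal $(3^{11} - 1)/11$.

```python
def midpoint(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def left_integral():
    """Integral of (2x^2 + 1)^10 * 4x over 0 <= x <= 1 (before substitution)."""
    return midpoint(lambda x: (2.0 * x * x + 1.0) ** 10 * (4.0 * x), 0.0, 1.0)


def right_integral():
    """Integral of y^10 over 1 <= y <= 3 (after substituting y = 2x^2 + 1)."""
    return midpoint(lambda y: y ** 10, 1.0, 3.0)


EXACT = (3 ** 11 - 1) / 11  # antiderivative y^11 / 11 evaluated from 1 to 3

print(left_integral(), right_integral(), EXACT)
```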

Our intention in this chapter is to generalize the substitution rule in two 
ways: 

(1) We shall deal with n-dimensional integrals rather than one-dimensional
integrals.

(2) We shall prove it for the extended integral, rather than merely for
integrals of bounded functions over bounded sets. This will require
us to limit ourselves to integrals over open sets in $\mathbf{R}^n$, but, as
Corollary 15.5 shows, this is not a serious restriction.

We call the generalized version of the substitution rule the change of
variables theorem.

§16. PARTITIONS OF UNITY


In order to prove the change of variables theorem, we need to reformulate the
definition of the extended integral $\int_A f$. This integral is obtained by breaking
the set A up into compact rectifiable sets $C_N$, and taking the limit of the
corresponding integrals $\int_{C_N} f$. In our new approach, we instead break the
function f up into functions $f_N$, each of which vanishes outside a compact set,
and we take the limit of the corresponding integrals $\int_A f_N$. This approach has
many advantages, especially for theoretical purposes; it will recur throughout
the rest of the book.

This approach involves a notion of comparatively recent origin in mathe- 
matics, called a “partition of unity,” which we define in this section. 

We begin with several lemmas. 

Lemma 16.1. Let Q be a rectangle in $\mathbf{R}^n$. There is a $C^\infty$ function
$\phi : \mathbf{R}^n \to \mathbf{R}$ such that $\phi(x) > 0$ for $x \in \operatorname{Int} Q$ and $\phi(x) = 0$ otherwise.


Proof. Let $f : \mathbf{R} \to \mathbf{R}$ be defined by the equation

\[ f(x) = \begin{cases} e^{-1/x} & \text{if } x > 0, \\ 0 & \text{otherwise.} \end{cases} \]

Then f(x) > 0 for x > 0. It is a standard result of single-variable analysis
that f is of class $C^\infty$. (A proof is outlined in the exercises.) Define

\[ g(x) = f(x) \cdot f(1 - x). \]

Then g is of class $C^\infty$; furthermore, g is positive for 0 < x < 1 and vanishes
otherwise. See Figure 16.1. Finally, if

Figure 16.1

\[ Q = [a_1, b_1] \times \cdots \times [a_n, b_n], \]

define

\[ \phi(x) = g\Big(\frac{x_1 - a_1}{b_1 - a_1}\Big) \cdot g\Big(\frac{x_2 - a_2}{b_2 - a_2}\Big) \cdots g\Big(\frac{x_n - a_n}{b_n - a_n}\Big). \qquad \square \]
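
The construction in this proof translates directly into code. The following Python sketch is an illustration only; the name `bump` and the representation of a rectangle as a list of `(a_i, b_i)` pairs are assumptions made here. In floating-point arithmetic, `exp(-1/x)` can underflow to zero for very small positive x, so strict positivity near the boundary of Q holds only for the exact function, not necessarily for its numerical evaluation.

```python
import math


def f(x):
    """f(x) = exp(-1/x) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def g(x):
    """g(x) = f(x) * f(1 - x): positive on (0, 1), zero elsewhere."""
    return f(x) * f(1.0 - x)


def bump(point, rectangle):
    """phi(x) for Q = [a_1, b_1] x ... x [a_n, b_n].

    `point` is a sequence of coordinates; `rectangle` is a sequence of
    (a_i, b_i) pairs with a_i < b_i. Each coordinate is rescaled to [0, 1]
    and the values of g are multiplied together.
    """
    value = 1.0
    for x_i, (a_i, b_i) in zip(point, rectangle):
        value *= g((x_i - a_i) / (b_i - a_i))
    return value


Q = [(0.0, 1.0), (0.0, 2.0)]       # a sample rectangle in R^2
print(bump((0.5, 1.0), Q))         # center of Q: positive
print(bump((0.0, 1.0), Q))         # a boundary point of Q: zero
print(bump((2.0, 1.0), Q))         # a point outside Q: zero
```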


Lemma 16.2. Let $\mathcal{A}$ be a collection of open sets in $\mathbf{R}^n$; let A be
their union. Then there exists a countable collection $Q_1, Q_2, \ldots$ of
rectangles contained in A such that:

(1) The sets Int $Q_i$ cover A.

(2) Each $Q_i$ is contained in an element of $\mathcal{A}$.

(3) Each point of A has a neighborhood that intersects only finitely
many of the sets $Q_i$.

Proof. It is not difficult to find rectangles Qi satisfying (1) and (2). 
Choosing them so they also satisfy (3), the so-called “local finiteness condi- 
tion,” is more difficult. 

Step 1. Let $D_1, D_2, \ldots$ be a sequence of compact subsets of A whose
union is A, such that $D_i \subset \operatorname{Int} D_{i+1}$ for each i. For convenience in notation,
let $D_i$ denote the empty set for $i \le 0$. Then for each i, define

\[ B_i = D_i - \operatorname{Int} D_{i-1}. \]

The set $B_i$ is bounded, being a subset of $D_i$; and it is closed, being the
intersection of the closed sets $D_i$ and $\mathbf{R}^n - \operatorname{Int} D_{i-1}$. Thus $B_i$ is compact.
Also, $B_i$ is disjoint from the closed set $D_{i-2}$, since $D_{i-2} \subset \operatorname{Int} D_{i-1}$. For
each $x \in B_i$, we choose a closed cube $C_x$ centered at x that is contained in A
and is disjoint from $D_{i-2}$; also choose $C_x$ small enough that it is contained
in an element of the collection of open sets $\mathcal{A}$. See Figure 16.2.

Figure 16.2

The interiors of the cubes $C_x$ cover $B_i$; choose finitely many of these
cubes whose interiors cover $B_i$; let $\mathcal{C}_i$ denote this finite collection of cubes.
See Figure 16.3.

Figure 16.3


Step 2. Let $\mathcal{C}$ be the collection

\[ \mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2 \cup \cdots; \]

then $\mathcal{C}$ is a countable collection of rectangles (in fact, of cubes). We show this
collection satisfies the requirements of the lemma.

By construction, each element of $\mathcal{C}$ is a rectangle contained in an element
of the collection $\mathcal{A}$. We show that the interiors of these rectangles cover A.
Given $x \in A$, let i be the smallest integer such that $x \in \operatorname{Int} D_i$. Then x is
an element of the set $B_i = D_i - \operatorname{Int} D_{i-1}$. Since the interiors of the cubes
belonging to the collection $\mathcal{C}_i$ cover $B_i$, the point x lies interior to one of these
cubes.

Finally, we check the local finiteness condition. Given x, we have
$x \in \operatorname{Int} D_i$ for some i. Each cube belonging to one of the collections
$\mathcal{C}_{i+2}, \mathcal{C}_{i+3}, \ldots$ is disjoint from $D_i$, by construction. Therefore the open set
Int $D_i$ can intersect only the cubes belonging to one of the collections
$\mathcal{C}_1, \ldots, \mathcal{C}_{i+1}$. Thus x has a neighborhood that intersects only finitely many
cubes from the collection $\mathcal{C}$. □

We remark that the local finiteness condition holds for each point x of
A, but it does not hold for a point x of Bd A. Each neighborhood of such a
point necessarily intersects infinitely many of the cubes from the collection $\mathcal{C}$,
as you can check.

Definition. If $\phi : \mathbf{R}^n \to \mathbf{R}$, then the support of $\phi$ is defined to be
the closure of the set $\{x \mid \phi(x) \ne 0\}$. Said differently, the support of $\phi$
is characterized by the property that if $x \notin \operatorname{Support} \phi$, then there is some
neighborhood of x on which the function $\phi$ vanishes identically.

Theorem 16.3 (Existence of a partition of unity). Let $\mathcal{A}$ be
a collection of open sets in $\mathbf{R}^n$; let A be their union. There exists a
sequence $\phi_1, \phi_2, \ldots$ of continuous functions $\phi_i : \mathbf{R}^n \to \mathbf{R}$ such that:

(1) $\phi_i(x) \ge 0$ for all x.

(2) The set $S_i = \operatorname{Support} \phi_i$ is contained in A.

(3) Each point of A has a neighborhood that intersects only finitely
many of the sets $S_i$.

(4) $\sum_{i=1}^{\infty} \phi_i(x) = 1$ for each $x \in A$.

(5) The functions $\phi_i$ are of class $C^\infty$.

(6) The sets $S_i$ are compact.

(7) For each i, the set $S_i$ is contained in an element of $\mathcal{A}$.

A collection of functions $\{\phi_i\}$ satisfying conditions (1)-(4) is called a
partition of unity on A. If it satisfies (5), it is said to be of class $C^\infty$; if
it satisfies (6), it is said to have compact supports; if it satisfies (7), it is said
to be dominated by the collection $\mathcal{A}$.

Proof. Given $\mathcal{A}$ and A, let $Q_1, Q_2, \ldots$ be a sequence of rectangles in A
satisfying the conditions stated in Lemma 16.2. For each i, let $\psi_i : \mathbf{R}^n \to \mathbf{R}$ be
a $C^\infty$ function that is positive on Int $Q_i$ and zero elsewhere. Then $\psi_i(x) \ge 0$
for all x. Furthermore, Support $\psi_i = Q_i$; the latter is a compact subset
of A that is contained in an element of $\mathcal{A}$. Finally, each point of A has a
neighborhood that intersects only finitely many of the sets $Q_i$. The collection
$\{\psi_i\}$ thus satisfies all the conditions of our theorem except for (4).

Condition (3) tells us that for $x \in A$, only finitely many of the numbers
$\psi_1(x), \psi_2(x), \ldots$ are non-zero. Thus the series

\[ \lambda(x) = \sum_{i=1}^{\infty} \psi_i(x) \]

converges trivially. Because each $x \in A$ has a neighborhood on which $\lambda(x)$
equals a finite sum of $C^\infty$ functions, $\lambda(x)$ is of class $C^\infty$. Finally, $\lambda(x) > 0$
for each $x \in A$; given x, there is a rectangle $Q_i$ whose interior contains x,
whence $\psi_i(x) > 0$. We now define

\[ \phi_i(x) = \psi_i(x)/\lambda(x); \]

the functions $\phi_i$ satisfy all of the conditions of our theorem. □

Conditions (1) and (4) imply that, for each $x \in A$, the numbers $\phi_i(x)$
actually "partition unity," that is, they express the unity element 1 as a sum of
non-negative numbers. The local finiteness condition (3) has the consequence
that for any compact set C contained in A, there is an open set about C
on which $\phi_i$ vanishes identically except for finitely many i. To find such an
open set, one covers C by finitely many neighborhoods, on each of which $\phi_i$
vanishes except for finitely many i; then one takes the union of this finite
collection of neighborhoods.


EXAMPLE 1. Let $f : \mathbf{R} \to \mathbf{R}$ be defined by the equation

\[ f(x) = \begin{cases} (1 + \cos x)/2 & \text{for } -\pi \le x \le \pi, \\ 0 & \text{otherwise.} \end{cases} \]

Then f is of class $C^1$. For each integer $m \ge 0$, set $\phi_{2m+1}(x) = f(x - m\pi)$.
For each integer $m \ge 1$, set $\phi_{2m}(x) = f(x + m\pi)$. Then the collection $\{\phi_i\}$
forms a partition of unity on $\mathbf{R}$. The support $S_i$ of $\phi_i$ is a closed interval of the
form $[k\pi, (k+2)\pi]$, which is compact, and each point of $\mathbf{R}$ has a neighborhood
that intersects at most three of the sets $S_i$. We leave it to you to check that
$\sum \phi_i(x) = 1$. Thus $\{\phi_i\}$ is a partition of unity on $\mathbf{R}$. See Figure 16.4.



Figure 16.4 
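
The identity $\sum \phi_i(x) = 1$ of Example 1 can be spot-checked numerically. The following is only a sketch, not a proof: the cutoff `count`, the sample points, and the function names are illustrative choices that do not come from the text.

```python
import math

def f(x):
    """Bump from Example 1: (1 + cos x)/2 on [-pi, pi], zero elsewhere."""
    return (1.0 + math.cos(x)) / 2.0 if -math.pi <= x <= math.pi else 0.0

def phi(i, x):
    """phi_{2m+1}(x) = f(x - m*pi) for m >= 0; phi_{2m}(x) = f(x + m*pi) for m >= 1."""
    if i % 2 == 1:
        m = (i - 1) // 2
        return f(x - m * math.pi)
    m = i // 2
    return f(x + m * math.pi)

def partial_sum(x, count=50):
    """Sum of phi_1, ..., phi_count at x; count is an illustrative cutoff."""
    return sum(phi(i, x) for i in range(1, count + 1))

# Sample points in [-20, 20); every phi_i that is non-zero there has i <= 50.
samples = [-20.0 + 0.37 * k for k in range(109)]
worst = max(abs(partial_sum(x) - 1.0) for x in samples)
print(worst)
```

Because of local finiteness, only a few terms are non-zero at each sample point, so the finite cutoff loses nothing on the sampled range.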


Now we explore the connection between partitions of unity and the ex- 
tended integral. We need a preliminary lemma: 

Lemma 16.4. Let A be open in $R^n$; let $f : A \to R$ be continuous. 
If f vanishes outside the compact subset C of A, then the integrals $\int_A f$ 
and $\int_C f$ exist and are equal. 
Proof. The integral $\int_C f$ exists because C is bounded, and the function 
$f_C$, which equals f on A and vanishes outside C, is continuous and bounded 
on all of $R^n$. 

Let $C_i$ be a sequence of compact rectifiable sets whose union is A, such 
that $C_i \subset$ Int $C_{i+1}$ for each i. Then C is covered by finitely many sets Int $C_i$, 
and hence by one of them, say Int $C_M$. Since f vanishes outside C, 

$$\int_C f = \int_{C_N} f$$

for all $N \geq M$. Applying this fact to the function $|f|$ shows that $\lim \int_{C_N} |f|$ 
exists, so that f is integrable over A; applying it to f shows that 
$\int_C f = \lim \int_{C_N} f = \int_A f$. □ 


Theorem 16.5. Let A be open in $R^n$; let $f : A \to R$ be continuous. 
Let $\{\phi_i\}$ be a partition of unity on A having compact supports. The 
integral $\int_A f$ exists if and only if the series 

$$\sum_{i=1}^{\infty} \Big[ \int_A \phi_i |f| \Big]$$

converges; in this case, 

$$\int_A f = \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big].$$

Note that the integral $\int_A \phi_i f$ exists and equals the ordinary integral 
$\int_{S_i} \phi_i f$ (where $S_i$ = Support $\phi_i$) by the preceding lemma. 

Proof. We consider first the case where f is non-negative on A. 

Step 1. Suppose f is non-negative on A, and suppose the series 
$\sum [\int_A \phi_i f]$ converges. We show that $\int_A f$ exists and 

$$\int_A f \leq \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big].$$

Let D be a compact rectifiable subset of A. There exists an M such that for 
all $i > M$, the function $\phi_i$ vanishes identically on D. Then 

$$f(x) = \sum_{i=1}^{M} \phi_i(x) f(x)$$

for $x \in D$. We conclude that 

$$\begin{aligned}
\int_D f &= \sum_{i=1}^{M} \Big[ \int_D \phi_i f \Big] && \text{by linearity,} \\
&\leq \sum_{i=1}^{M} \Big[ \int_{D \cup S_i} \phi_i f \Big] && \text{by monotonicity,} \\
&= \sum_{i=1}^{M} \Big[ \int_A \phi_i f \Big] && \text{by the preceding lemma,} \\
&\leq \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big].
\end{aligned}$$

It follows that f is integrable over A, and 

$$\int_A f \leq \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big].$$

Step 2. Suppose f is non-negative on A, and suppose f is integrable 
over A. We show the series $\sum [\int_A \phi_i f]$ converges, and 

$$\sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big] \leq \int_A f.$$

Given N, the set $D = S_1 \cup \cdots \cup S_N$ is compact. Furthermore, for 
$i = 1, \ldots, N$, the function $\phi_i f$ vanishes outside D, so that 

$$\int_A \phi_i f = \int_D \phi_i f$$

by the preceding lemma. We conclude that 

$$\begin{aligned}
\sum_{i=1}^{N} \Big[ \int_A \phi_i f \Big] &= \sum_{i=1}^{N} \Big[ \int_D \phi_i f \Big] \\
&= \int_D \Big[ \sum_{i=1}^{N} \phi_i f \Big] && \text{by linearity,} \\
&\leq \int_D f && \text{by the comparison property,} \\
&\leq \int_A f && \text{by monotonicity.}
\end{aligned}$$

Thus the series $\sum [\int_A \phi_i f]$ converges because its partial sums are bounded, 
and its sum is less than or equal to $\int_A f$. 

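The theorem is now proved for non-negative functions f. 

Step 3. Consider the case of an arbitrary continuous function $f : A \to R$. 
By Theorem 15.2, the integral $\int_A f$ exists if and only if the integral $\int_A |f|$ 
exists, and this occurs if and only if the series 

$$\sum_{i=1}^{\infty} \Big[ \int_A \phi_i |f| \Big]$$

converges, by Steps 1 and 2. 

On the other hand, if $\int_A f$ exists, then 

$$\begin{aligned}
\int_A f &= \int_A f_+ - \int_A f_- && \text{by definition,} \\
&= \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f_+ \Big] - \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f_- \Big] && \text{by Steps 1 and 2,} \\
&= \sum_{i=1}^{\infty} \Big[ \int_A \phi_i f \Big] && \text{by linearity,}
\end{aligned}$$

since convergent series can be added term-by-term. □ 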


EXERCISES 

1. Prove that the function f of Lemma 16.1 is of class $C^\infty$ as follows: Given 
any integer $n \geq 0$, define $f_n : R \to R$ by the equation 

$$f_n(x) = \begin{cases} (e^{-1/x})/x^n & \text{for } x > 0, \\ 0 & \text{for } x \leq 0. \end{cases}$$

(a) Show that $f_n$ is continuous at 0. [Hint: Show that $a < e^a$ for all a. 
Then set $a = t/2n$ to conclude that 

$$\frac{t^n}{e^t} < \frac{(2n)^n}{e^{t/2}}.$$

Set $t = 1/x$ and let x approach 0 through positive values.] 

(b) Show that $f_n$ is differentiable at 0. 

(c) Show that $f_n'(x) = f_{n+2}(x) - n f_{n+1}(x)$ for all x. 

(d) Show that $f_n$ is of class $C^\infty$. 
2. Show that the functions defined in Example 1 form a partition of unity 
on R. [Hint: Let $f_m(x) = f(x - m\pi)$, for all integers m. Show that 
$\sum f_{2m}(x) = (1 + \cos x)/2$. Then find $\sum f_{2m+1}(x)$.] 

3. (a) Let S be an arbitrary subset of $R^n$; let $x_0 \in S$. We say that the 
function $f : S \to R$ is differentiable at $x_0$, of class $C^r$, provided there 
is a $C^r$ function $g : U \to R$ defined in a neighborhood U of $x_0$ in 
$R^n$, such that g agrees with f on the set $U \cap S$. In this case, show 
that if $\phi : R^n \to R$ is a $C^r$ function whose support lies in U, then 
the function 

$$h(x) = \begin{cases} \phi(x) g(x) & \text{for } x \in U, \\ 0 & \text{for } x \notin \text{Support } \phi, \end{cases}$$

is well-defined and of class $C^r$ on $R^n$. 

(b) Prove the following: 

Theorem. If $f : S \to R$ and f is differentiable of class $C^r$ 
at each point $x_0$ of S, then f may be extended to a $C^r$ function 
$h : A \to R$ that is defined on an open set A of $R^n$ containing S. 

[Hint: Cover S by appropriately chosen neighborhoods, let A be 
their union, and take a $C^\infty$ partition of unity on A dominated by 
this collection of neighborhoods.] 


§17. THE CHANGE OF VARIABLES THEOREM 


Now we discuss the general change of variables theorem. We begin by review- 
ing the version of it used in calculus; although this version is usually proved 
in a first course in single-variable analysis, we reprove it here. 

Recall the common convention that if f is integrable over [a, b], then one 
defines 

$$\int_b^a f = -\int_a^b f.$$

Theorem 17.1 (Substitution rule). Let I = [a, b]. Let $g : I \to R$ 
be a function of class $C^1$, with $g'(x) \neq 0$ for $x \in (a, b)$. Then the set 
g(I) is a closed interval J with end points g(a) and g(b). If $f : J \to R$ 
is continuous, then 

$$\int_{g(a)}^{g(b)} f = \int_a^b (f \circ g)\, g',$$

or equivalently, 

$$\int_J f = \int_I (f \circ g)\, |g'|.$$

Proof. Continuity of $g'$ and the intermediate-value theorem imply that 
either $g'(x) > 0$ or $g'(x) < 0$ on all of (a, b). Hence g is either strictly 
increasing or strictly decreasing on I, by the mean-value theorem, so that g 
is one-to-one. In the case where $g' > 0$, we have $g(a) < g(b)$; in the case 
where $g' < 0$, we have $g(a) > g(b)$. In either case, let J = [c, d] denote the 
interval with end points g(a) and g(b). See Figure 17.1. The intermediate- 
value theorem implies that g carries I onto J. Then the composite function 
f(g(x)) is defined for all x in [a, b], so the theorem at least makes sense. 




Figure 17.1 


Define 

$$F(y) = \int_c^y f$$

for y in [c, d]. Because f is continuous, the fundamental theorem of calculus 
implies that $F'(y) = f(y)$. Consider the composite function $h(x) = F(g(x))$; 
we differentiate it by the chain rule. We have 

$$h'(x) = F'(g(x))\, g'(x) = f(g(x))\, g'(x).$$

Because the latter function is continuous, we can apply the fundamental the- 
orem of calculus to integrate it. We have 

$$\begin{aligned}
\int_{x=a}^{x=b} f(g(x))\, g'(x) &= h(b) - h(a) \\
&= F(g(b)) - F(g(a)) \\
&= \int_c^{g(b)} f - \int_c^{g(a)} f.
\end{aligned}$$

Now c equals either g(a) or g(b). In either case, this equation can be written 
in the form 

(*)    $\int_a^b (f \circ g)\, g' = \int_{g(a)}^{g(b)} f.$

This is the first of our desired formulas. 

Now in the case where $g' > 0$, we have J = [g(a), g(b)]. Since $|g'| = g'$ 
in this case, equation (*) can be written in the form 

(**)    $\int_I (f \circ g)\, |g'| = \int_J f.$

In the case where $g' < 0$, we have J = [g(b), g(a)]. Since $|g'| = -g'$ in this 
case, equation (*) can again be written in the form (**). □ 


EXAMPLE 1. Consider the integral 

$$\int_{x=0}^{x=1} (2x^2 + 1)^{10} (4x).$$

Set $f(y) = y^{10}$ and $g(x) = 2x^2 + 1$. Then $g'(x) = 4x$, which is positive for 
$0 < x < 1$. See Figure 17.2. The substitution rule implies that 

$$\int_{x=0}^{x=1} (2x^2 + 1)^{10} (4x) = \int_{x=0}^{x=1} f(g(x))\, g'(x) = \int_{y=1}^{y=3} f(y) = \int_{y=1}^{y=3} y^{10}.$$




Figure 17.2 


Figure 17.3 
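
The two sides of the substitution rule in Example 1 can be compared numerically. This is a minimal sketch that uses a composite midpoint rule as a stand-in for the ordinary integral; the helper `midpoint` and the step count are illustrative choices, not part of the text.

```python
def midpoint(func, a, b, n=20000):
    """Composite midpoint rule for the ordinary integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def f(y):
    return y ** 10

def g(x):
    return 2 * x ** 2 + 1

def dg(x):
    return 4 * x

# Integral in x over [0, 1] of f(g(x)) g'(x).
lhs = midpoint(lambda x: f(g(x)) * dg(x), 0.0, 1.0)
# Integral in y over [g(0), g(1)] = [1, 3] of f(y).
rhs = midpoint(f, g(0.0), g(1.0))
# Closed form of the y-integral: (3^11 - 1^11) / 11.
exact = (3 ** 11 - 1) / 11
print(lhs, rhs, exact)
```

Both approximations should agree with the closed form up to the discretization error of the midpoint rule.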
EXAMPLE 2. Consider the integral 

$$\int_{y=-1}^{y=1} \frac{1}{(1 - y^2)^{1/2}}.$$

In calculus one proceeds as follows: Set $y = g(x) = \sin x$ for $-\pi/2 \leq x \leq \pi/2$. 
Then $g'(x) = \cos x$, which is positive on $(-\pi/2, \pi/2)$, and g satisfies 
the conditions $g(-\pi/2) = -1$ and $g(\pi/2) = 1$. See Figure 17.3. If f(y) is 
continuous on the interval [-1, 1], then the substitution rule tells us that 

$$\int_{-1}^{1} f = \int_{-\pi/2}^{\pi/2} (f \circ g)\, g'.$$

Applying this rule to the function $f(y) = 1/(1 - y^2)^{1/2}$, we have 

$$\int_{-1}^{1} \frac{1}{(1 - y^2)^{1/2}} = \int_{-\pi/2}^{\pi/2} \frac{1}{(1 - \sin^2 x)^{1/2}} \cos x = \int_{-\pi/2}^{\pi/2} 1 = \pi.$$

Thus the problem seems to be solved. 

However, there is a difficulty here. The substitution rule does not apply in 
this case, for the function f(y) is not continuous on the interval $-1 \leq y \leq 1$! 
The integral of f is in fact an improper integral, since f is not even bounded 
on the interval (-1, 1). 
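
One way to see what is going on in Example 2 is to truncate the interval. On $[-1 + \epsilon, 1 - \epsilon]$ the function f is continuous, so Theorem 17.1 does apply there, and the truncated integral equals the length of the corresponding x-interval. The sketch below tabulates these values for a few illustrative choices of $\epsilon$; the function name is hypothetical, and the truncated intervals are only meant to play the role of an exhausting sequence of compact sets.

```python
import math

def truncated_integral(eps):
    """Integral of 1/sqrt(1 - y^2) over [-1 + eps, 1 - eps], via Theorem 17.1.

    On the truncated interval f is continuous, so the substitution y = sin x
    is legitimate and the integrand in x is the constant 1.
    """
    upper = math.asin(1.0 - eps)
    lower = math.asin(-1.0 + eps)
    return upper - lower

values = [truncated_integral(10.0 ** (-k)) for k in (2, 4, 6, 8)]
for v in values:
    print(v, math.pi - v)
```

The values increase toward $\pi$, which is consistent with the answer obtained formally in the example.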

As indicated earlier, we shall generalize the substitution rule to n-dimen- 
sional integrals, and we shall prove it for the extended integral rather than 
merely for the ordinary integral. One reason is that the extended integral is 
actually easier to work with in this context than the ordinary integral. The 
other is that even in elementary problems one often needs to use the substi- 
tution rule in a situation where Theorem 17.1 does not apply, as Example 2 
shows. 

If we are to generalize this rule, we need to determine what a “substi- 
tution” or a “change of variables” is to be, in an n-dimensional extended 
integral. It is the following: 

Definition. Let A be open in $R^n$. Let $g : A \to R^n$ be a one-to-one 
function of class $C^r$, such that det $Dg(x) \neq 0$ for $x \in A$. Then g is called a 
change of variables in $R^n$. 

An equivalent notion is the following: If A and B are open sets in $R^n$ 
and if $g : A \to B$ is a one-to-one function carrying A onto B such that both 
g and $g^{-1}$ are of class $C^r$, then g is called a diffeomorphism (of class $C^r$). 
Now if g is a diffeomorphism, then the chain rule implies that Dg is non- 
singular, so that det $Dg \neq 0$; thus g is also a change of variables. Conversely, 
if $g : A \to R^n$ is a change of variables in $R^n$, then Theorem 8.2 tells us that 
the set B = g(A) is open in $R^n$ and the function $g^{-1} : B \to A$ is of class 
$C^r$. Thus the terms “diffeomorphism” and “change of variables” are different 
terms for the same concept. 

We now state the general change of variables theorem: 





Theorem 17.2 (Change of variables theorem). Let $g : A \to B$ 
be a diffeomorphism of open sets in $R^n$. Let $f : B \to R$ be a contin- 
uous function. Then f is integrable over B if and only if the function 
$(f \circ g)|\det Dg|$ is integrable over A; in this case, 

$$\int_B f = \int_A (f \circ g)\, |\det Dg|.$$

Note that in the special case n = 1, the derivative Dg is the 1 by 1 matrix 
whose entry is $g'$. Thus this theorem includes the classical substitution rule 
as a special case. It includes more, of course, since the integrals involved 
are extended integrals. It justifies, for example, the computations made in 
Example 2. 

We shall prove this theorem in a later section. For the present, let us 
illustrate how it can be used to justify computations commonly made in mul- 
tivariable calculus. 

EXAMPLE 3. Let B be the open set in $R^2$ defined by the equation 

$$B = \{(x, y) \mid x > 0 \text{ and } y > 0 \text{ and } x^2 + y^2 < a^2\}.$$

One commonly computes an integral over B, such as $\int_B x^2 y^2$, by the use of the 
polar coordinate transformation. This is the transformation $g : R^2 \to R^2$ 
defined by the equation 

$$g(r, \theta) = (r \cos \theta, r \sin \theta).$$

One checks readily that det $Dg(r, \theta) = r$, and that the map g carries the open 
rectangle 

$$A = \{(r, \theta) \mid 0 < r < a \text{ and } 0 < \theta < \pi/2\}$$

in the $(r, \theta)$ plane onto B in a one-to-one fashion. Since det $Dg = r > 0$ on 
A, the map $g : A \to B$ is a diffeomorphism. See Figure 17.4. 



Figure 17.4 
The change of variables theorem implies that 

$$\int_B x^2 y^2 = \int_A (r \cos \theta)^2 (r \sin \theta)^2\, r;$$

since the latter exists as an ordinary integral as well as an extended integral, 
it can be evaluated (easily) by use of the Fubini theorem. 
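
The equality in Example 3 can be checked numerically by evaluating both sides as iterated integrals. This is a rough sketch with an illustrative radius a = 1 and a hand-rolled midpoint rule (`midpoint` is a hypothetical helper); the closed form $\pi a^6/96$ comes from $\int_0^a r^5 = a^6/6$ and $\int_0^{\pi/2} \cos^2\theta \sin^2\theta = \pi/16$.

```python
import math

def midpoint(func, a, b, n=200):
    """Composite midpoint rule for the ordinary integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

a = 1.0  # radius; illustrative value

# Right-hand side: iterated integral over A = (0, a) x (0, pi/2) in (r, theta).
polar = midpoint(
    lambda r: midpoint(
        lambda t: (r * math.cos(t)) ** 2 * (r * math.sin(t)) ** 2 * r,
        0.0, math.pi / 2),
    0.0, a)

# Left-hand side: iterated integral over the quarter disc B in (x, y).
cartesian = midpoint(
    lambda x: midpoint(lambda y: x ** 2 * y ** 2,
                       0.0, math.sqrt(a * a - x * x)),
    0.0, a)

exact = math.pi * a ** 6 / 96
print(polar, cartesian, exact)
```

The polar form is easier because the region of integration becomes a rectangle, which is the practical point of the change of variables.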

EXAMPLE 4. Suppose we wish to integrate the same function $x^2 y^2$ over the 
open set 

$$W = \{(x, y) \mid x^2 + y^2 < a^2\}.$$

Here the use of polar coordinates is a bit more tricky. The polar coordinate 
transformation g does not in this case define a diffeomorphism of an open set 
in the $(r, \theta)$ plane with W. However, g does define a diffeomorphism of the 
open set $U = (0, a) \times (0, 2\pi)$ with the open set 

$$V = \{(x, y) \mid x^2 + y^2 < a^2 \text{ and } x < 0 \text{ if } y = 0\}$$

of $R^2$. See Figure 17.5; the set V consists of W with the non-negative x-axis 
deleted. Because the non-negative x-axis has measure zero, 

$$\int_W x^2 y^2 = \int_V x^2 y^2.$$

The latter can be expressed as an integral over U, by use of the polar coordi- 
nate transformation. 




Figure 17.5 
EXAMPLE 5. Let B be the open set in $R^3$ defined by the equation 

$$B = \{(x, y, z) \mid x > 0 \text{ and } y > 0 \text{ and } x^2 + y^2 + z^2 < a^2\}.$$

One commonly evaluates an integral over B, such as $\int_B x^2 z$, by the use of 
the spherical coordinate transformation, which is the transformation 
$g : R^3 \to R^3$ defined by the equation 

$$g(\rho, \phi, \theta) = (\rho \sin \phi \cos \theta,\ \rho \sin \phi \sin \theta,\ \rho \cos \phi).$$

Now det $Dg = \rho^2 \sin \phi$, as you can check. Thus det Dg is positive if $0 < \phi < \pi$ 
and $\rho \neq 0$. The transformation g carries the open set 

$$A = \{(\rho, \phi, \theta) \mid 0 < \rho < a \text{ and } 0 < \phi < \pi \text{ and } 0 < \theta < \pi/2\}$$

in a one-to-one fashion onto B, as you can check. See Figure 17.6. Since 
det $Dg > 0$ on A, the change of variables theorem implies that 

$$\int_B x^2 z = \int_A (\rho \sin \phi \cos \theta)^2 (\rho \cos \phi)\, \rho^2 \sin \phi.$$



Figure 17.6 
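
The formula det $Dg = \rho^2 \sin \phi$ from Example 5 can be spot-checked with finite differences. The sketch below is not a derivation; the step size `h`, the sample point, and the function names are illustrative assumptions.

```python
import math

def g(rho, phi, theta):
    """Spherical coordinate transformation of Example 5."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def jacobian_det(func, p, h=1e-6):
    """Determinant of a central-difference approximation to the 3 by 3 matrix Dfunc(p)."""
    cols = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        fp, fm = func(*plus), func(*minus)
        # Column k holds the partial derivatives with respect to variable k.
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    J = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
            - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
            + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))

point = (2.0, 0.7, 0.4)                      # (rho, phi, theta), illustrative
numeric = jacobian_det(g, point)
formula = point[0] ** 2 * math.sin(point[1])
print(numeric, formula)
```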
EXERCISES 


1. Check the computations made in Examples 3 and 5. 

2. If 

$$V = \{(x, y, z) \mid x^2 + y^2 + z^2 < a^2 \text{ and } z > 0\},$$

use the spherical coordinate transformation to express $\int_V z$ as an integral 
over an appropriate set in $(\rho, \phi, \theta)$ space. Justify your answer. 

3. Let U be the open set in $R^2$ consisting of all x with $\|x\| < 1$. Let 
$f(x, y) = 1/(x^2 + y^2)$ for $(x, y) \neq 0$. Determine whether f is integrable 
over $U - 0$ and over $R^2 - \bar{U}$; if so, evaluate. 

4. (a) Show that 

$$\int_{R^2} e^{-(x^2 + y^2)} = \Big[ \int_R e^{-x^2} \Big]^2,$$

provided the first of these integrals exists. 

(b) Show the first of these integrals exists and evaluate it. 

5. Let B be the portion of the first quadrant in $R^2$ lying between the hyper- 
bolas xy = 1 and xy = 2 and the two straight lines y = x and y = 4x. 
Evaluate $\int_B x^2 y^3$. [Hint: Set x = u/v and y = uv.] 

6. Let S be the tetrahedron in $R^3$ having vertices (0,0,0), (1,2,3), (0,1,2), 
and (-1,1,1). Evaluate $\int_S f$, where f(x, y, z) = x + 2y - z. [Hint: Use 
a suitable linear transformation g as a change of variables.] 

7. Let 0 < a < b. If one takes the circle in the xz-plane of radius a centered 
at the point (b,0,0), and if one rotates it about the z-axis, one obtains 
a surface called the torus. If one rotates the corresponding circular disc 
instead of the circle, one obtains a 3-dimensional solid called the solid 
torus. Find the volume of this solid torus. See Figure 17.7. [Hint: One 
can proceed directly, but it is easier to use the cylindrical coordinate 
transformation 

$$g(r, \theta, z) = (r \cos \theta, r \sin \theta, z).$$

The solid torus is the image under g of the set of all $(r, \theta, z)$ for which 
$(r - b)^2 + z^2 \leq a^2$ and $0 \leq \theta \leq 2\pi$.] 



Figure 17.7 
§18. DIFFEOMORPHISMS IN $R^n$ 


In order to prove the change of variables theorem, we need to obtain some 
fundamental properties of diffeomorphisms. This we do in the present section. 
Our first basic result is that the image of a compact rectifiable set under a 
diffeomorphism is another compact rectifiable set. And the second is that any 
diffeomorphism can be broken up locally into a composite of diffeomorphisms 
of a special type, called “primitive diffeomorphisms.” 

We begin with a preliminary lemma. 

Lemma 18.1. Let A be open in $R^n$; let $g : A \to R^n$ be a function 
of class $C^1$. If the subset E of A has measure zero in $R^n$, then the set 
g(E) also has measure zero in $R^n$. 

Proof. Step 1. Let $\epsilon, \delta > 0$. We first show that if a set S has measure 
zero in $R^n$, then S can be covered by countably many closed cubes, each of 
width less than $\delta$, having total volume less than $\epsilon$. 

To prove this fact, it suffices to show that if Q is a rectangle 

$$Q = [a_1, b_1] \times \cdots \times [a_n, b_n]$$

in $R^n$, then Q can be covered by finitely many cubes, each of width less than 
$\delta$, having total volume less than 2v(Q). Choose $\lambda > 0$ so that the rectangle 

$$Q_\lambda = [a_1 - \lambda, b_1 + \lambda] \times \cdots \times [a_n - \lambda, b_n + \lambda]$$

has volume less than 2v(Q). 

Then choose N so that 1/N is less than the smaller of $\delta$ and $\lambda$. Consider 
all rational numbers of the form m/N, where m is an arbitrary integer. Let 
$c_i$ be the largest such number for which $c_i \leq a_i$, and let $d_i$ be the smallest 
such number for which $d_i \geq b_i$. Then $[a_i, b_i] \subset [c_i, d_i] \subset [a_i - \lambda, b_i + \lambda]$. See 
Figure 18.1. Let $Q'$ be the rectangle 

$$Q' = [c_1, d_1] \times \cdots \times [c_n, d_n],$$

which contains Q and is contained in $Q_\lambda$. Then $v(Q') < 2v(Q)$. Each of 
the component intervals $[c_i, d_i]$ of $Q'$ is partitioned by points of the form 
m/N into subintervals of length 1/N. Then $Q'$ is partitioned into subrectan- 
gles that are cubes of width 1/N (which is less than $\delta$); these subrectangles 
cover Q. By Theorem 10.4, the total volume of these cubes equals $v(Q')$. 
Figure 18.1 
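
The grid construction of Step 1 can be carried out exactly with rational arithmetic. The following sketch uses an illustrative rectangle, $\delta$, $\lambda$ and N that are not from the text, and a hypothetical helper `cube_cover`; it computes the end points $c_i$, $d_i$, the number of cubes, and their total volume $v(Q')$.

```python
from fractions import Fraction
from math import floor, ceil

def cube_cover(a, b, N):
    """Grid of cubes of width 1/N covering [a_1, b_1] x ... x [a_n, b_n].

    Returns (c, d, number of cubes, total volume), where c_i is the largest
    multiple of 1/N with c_i <= a_i and d_i is the smallest with d_i >= b_i.
    """
    c = [Fraction(floor(ai * N), N) for ai in a]
    d = [Fraction(ceil(bi * N), N) for bi in b]
    count = 1
    for ci, di in zip(c, d):
        count *= int((di - ci) * N)      # subintervals of length 1/N in [c_i, d_i]
    total = count * Fraction(1, N) ** len(a)
    return c, d, count, total

# Illustrative data: Q = [1/3, 1] x [0, 1/2] and delta = 1/10.
a = [Fraction(1, 3), Fraction(0)]
b = [Fraction(1), Fraction(1, 2)]
v_Q = (b[0] - a[0]) * (b[1] - a[1])      # v(Q) = 1/3
lam = Fraction(1, 20)                    # v(Q_lam) = (23/30)(3/5) = 23/50 < 2 v(Q)
N = 25                                   # 1/N is less than both delta and lam
c, d, count, total = cube_cover(a, b, N)
print(c, d, count, total)
```

Here the cover consists of 17 by 13 cubes of width 1/25, with total volume 221/625, which is indeed less than 2v(Q) = 2/3.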


Step 2. Let C be a closed cube contained in A. Let 

$$|Dg(x)| \leq M \quad \text{for } x \in C.$$

We show that if C has width w, then g(C) is contained in a closed cube in 
$R^n$ of width (nM)w. 

Let a be the center of C; then C consists of all points x of $R^n$ such that 
$|x - a| \leq w/2$. Now the mean-value theorem implies that given $x \in C$, there 
is a point $c_j$ on the line segment from a to x such that 

$$g_j(x) - g_j(a) = Dg_j(c_j) \cdot (x - a).$$

Then 

$$|g_j(x) - g_j(a)| \leq n\, |Dg_j(c_j)| \cdot |x - a| \leq nM(w/2).$$

It follows from this inequality that if $x \in C$, then g(x) lies in the cube 
consisting of all $y \in R^n$ such that 

$$|y - g(a)| \leq nM(w/2).$$

This cube has width (nM)w, as desired. 
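
The bound of Step 2 can be illustrated on a concrete map. In the sketch below the map g, the cube, and the grid of sample points are all assumptions chosen for illustration; M is the largest absolute value of an entry of Dg on the cube, matching the sup norm used in the text.

```python
def g(x, y):
    """Illustrative C^1 map of R^2 into itself (an assumption, not from the text)."""
    return (x * x - y * y, 2.0 * x * y)

# Cube C with center a = (1, 1) and width w = 1, that is, [0.5, 1.5] x [0.5, 1.5].
center, w, n = (1.0, 1.0), 1.0, 2
# Dg(x, y) = [[2x, -2y], [2y, 2x]], so every entry has absolute value at most 3 on C.
M = 3.0

steps = 50
# Both coordinates of the center equal 1, so one grid serves for x and for y.
grid = [center[0] - w / 2 + w * k / steps for k in range(steps + 1)]
ga = g(*center)
# Largest sup-norm distance from g(a) to g(x) over the sample points of C.
worst = max(max(abs(u - v) for u, v in zip(g(x, y), ga))
            for x in grid for y in grid)
bound = n * M * (w / 2)
print(worst, bound)
```

For this map the sampled image stays within distance 2.5 of g(a), comfortably inside the cube of half-width nM(w/2) = 3 guaranteed by Step 2.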

Step 3. Now we prove the theorem. Suppose E is a subset of A and E 
has measure zero. We show that g(E) has measure zero. 

Let $C_i$ be a sequence of compact sets whose union is A, such that 
$C_i \subset$ Int $C_{i+1}$ for each i. Let $E_k = C_k \cap E$; it suffices to show that $g(E_k)$ has 
measure zero. Given $\epsilon > 0$, we shall cover $g(E_k)$ by cubes of total volume 
less than $\epsilon$. 

Since $C_k$ is compact, we can choose $\delta > 0$ so that the $\delta$-neighborhood of 
$C_k$ (in the sup metric) lies in Int $C_{k+1}$, by Theorem 4.6. Choose M so that 

$$|Dg(x)| \leq M \quad \text{for } x \in C_{k+1}.$$

Using Step 1, cover $E_k$ by countably many cubes, each of width less than $\delta$, 
having total volume less than 

$$\epsilon' = \epsilon/(nM)^n.$$

Let $D_1, D_2, \ldots$ denote those cubes that actually intersect $E_k$. Because $D_i$ 
has width less than $\delta$, it is contained in $C_{k+1}$. Then $|Dg(x)| \leq M$ for $x \in D_i$, 
so that $g(D_i)$ lies in a cube $D_i'$ of width nM(width $D_i$), by Step 2. The cube 
$D_i'$ has volume 

$$v(D_i') = (nM)^n (\text{width } D_i)^n = (nM)^n v(D_i).$$

Therefore the cubes $D_i'$, which cover $g(E_k)$, have total volume less than 
$(nM)^n \epsilon' = \epsilon$, as desired. See Figure 18.2. □ 

EXAMPLE 1. Differentiability is needed for the truth of the preceding 
lemma. If g is merely continuous, then the image of a set of measure zero 
need not have measure zero. This fact follows from the existence of a contin- 
uous map $f : [0,1] \to [0,1]^2$ whose image set is the entire square $[0,1]^2$! It 
is called the Peano space-filling curve, and it is studied in topology. (See 
[M], for example.) 

Theorem 18.2. Let $g : A \to B$ be a diffeomorphism of class $C^r$, 
where A and B are open sets in $R^n$. Let D be a compact subset of A, 
and let E = g(D). 

(a) We have 

g(Int D) = Int E and g(Bd D) = Bd E. 

(b) If D is rectifiable, so is E. 
Proof. (a) The map $g^{-1}$ is continuous. Therefore, for any open set U 
contained in A, the set g(U) is an open set contained in B. In particular, 
g(Int D) is an open set in $R^n$ contained in the set g(D) = E. Thus 

(1)    g(Int D) $\subset$ Int E. 

Similarly, g carries the open set (Ext D) $\cap$ A onto an open set contained in B. 
Because g is one-to-one, the set g((Ext D) $\cap$ A) is disjoint from g(D) = E. 
Thus 

(2)    g((Ext D) $\cap$ A) $\subset$ Ext E. 

It follows that 

(3)    g(Bd D) $\supset$ Bd E. 

For let $y \in$ Bd E; we show that $y \in$ g(Bd D). The set E is compact, since D 
is compact and g is continuous. Hence E is closed, so it must contain its 
boundary point y. Then $y \in B$. Let x be the point of A such that g(x) = y. 
The point x cannot lie in Int D, by (1), and cannot lie in Ext D, by (2). 
Therefore $x \in$ Bd D, so that $y \in$ g(Bd D), as desired. See Figure 18.3. 



Figure 18.3 


Symmetry implies that these same results hold for the map $g^{-1} : B \to A$. 
In particular, 

(1')    $g^{-1}$(Int E) $\subset$ Int D, 

(3')    $g^{-1}$(Bd E) $\supset$ Bd D. 

Combining (1) and (1') we see that g(Int D) = Int E; combining (3) and (3') 
gives the equation g(Bd D) = Bd E. 

(b) If D is rectifiable, then Bd D has measure zero. By the preceding 
lemma, g(Bd D) also has measure zero. But g(Bd D) = Bd E. Thus E is 
rectifiable. □ 


Now we show that an arbitrary diffeomorphism of open sets in $R^n$ can 
be “factored” locally into diffeomorphisms of a certain special type. This 
technical result will be crucial in the proof of the change of variables theorem. 

Definition. Let $h : A \to B$ be a diffeomorphism of open sets in $R^n$ 
(where $n \geq 2$), given by the equation 

$$h(x) = (h_1(x), \ldots, h_n(x)).$$

Given i, we say that h preserves the i-th coordinate if $h_i(x) = x_i$ for 
all $x \in A$. If h preserves the i-th coordinate for some i, then h is called a 
primitive diffeomorphism. 

Theorem 18.3. Let $g : A \to B$ be a diffeomorphism of open sets 
in $R^n$, where $n \geq 2$. Given $a \in A$, there is a neighborhood $U_0$ of a 
contained in A, and a sequence of diffeomorphisms of open sets in $R^n$, 

$$U_0 \xrightarrow{h_1} U_1 \xrightarrow{h_2} U_2 \to \cdots \xrightarrow{h_k} U_k,$$

such that the composite $h_k \circ \cdots \circ h_2 \circ h_1$ equals $g|U_0$, and such that each 
$h_i$ is a primitive diffeomorphism. 

Proof. Step 1. We first consider the special case of a linear transfor- 
mation. Let $T : R^n \to R^n$ be the linear transformation $T(x) = C \cdot x$, where 
C is a non-singular n by n matrix. We show that T factors into a sequence 
of primitive non-singular linear transformations. 

This is easy. The matrix C equals a product of elementary matrices, by 
Theorem 2.4. The transformation corresponding to an elementary matrix may 
either (1) switch two coordinates, or (2) replace the i-th coordinate by itself plus 
a multiple of another coordinate, or (3) multiply the i-th coordinate by a non- 
zero scalar. Transformations of types (2) and (3) are clearly primitive, since 
they leave all but the i-th coordinate fixed. We show that a transformation of 
type (1) is a composite of transformations of types (2) and (3), and our result 
follows. Indeed, it is easy to check that the following sequence of elementary 
operations has the effect of exchanging rows i and j: 

                                            Row i     Row j 
    Initial state                           a         b 
    Replace (row i) by (row i) - (row j)    a - b     b 
    Replace (row j) by (row j) + (row i)    a - b     a 
    Replace (row i) by (row i) - (row j)    -b        a 
    Multiply (row i) by -1                  b         a 
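
The four row operations in the table can be replayed mechanically. The sketch below applies them to a small matrix stored as a list of rows; the function name and the sample matrix are illustrative, not part of the text.

```python
def swap_via_elementary_ops(rows, i, j):
    """Exchange rows i and j using only operations of types (2) and (3)."""
    rows = [list(r) for r in rows]                          # work on a copy
    rows[i] = [x - y for x, y in zip(rows[i], rows[j])]    # row i <- row i - row j
    rows[j] = [x + y for x, y in zip(rows[j], rows[i])]    # row j <- row j + row i
    rows[i] = [x - y for x, y in zip(rows[i], rows[j])]    # row i <- row i - row j
    rows[i] = [-x for x in rows[i]]                         # row i <- (-1) * row i
    return rows

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
swapped = swap_via_elementary_ops(matrix, 0, 2)
print(swapped)
```

Each step changes a single row, which is why the corresponding linear transformations are primitive.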


Step 2. We next consider the case where g is a translation. Let 
$t : R^n \to R^n$ be the map $t(x) = x + c$. Then t is the composite of the translations 

$$t_1(x) = x + (0, c_2, \ldots, c_n),$$

$$t_2(x) = x + (c_1, 0, \ldots, 0),$$

both of which are primitive. 


Step 3. We now consider the special case where a = 0 and g(0) = 0 
and $Dg(0) = I_n$. We show that in this case, g factors locally as a composite 
of two primitive diffeomorphisms. 

Let us write g in components as 

$$g(x) = (g_1(x), \ldots, g_n(x)) = (g_1(x_1, \ldots, x_n), \ldots, g_n(x_1, \ldots, x_n)).$$

Define $h : A \to R^n$ by the equation 

$$h(x) = (g_1(x), \ldots, g_{n-1}(x), x_n).$$

Now h(0) = 0, because $g_i(0) = 0$ for all i; and 

$$Dh(x) = \begin{bmatrix} \partial(g_1, \ldots, g_{n-1})/\partial x \\ 0 \ \cdots \ 0 \ \ 1 \end{bmatrix}.$$

Since the matrix $\partial(g_1, \ldots, g_{n-1})/\partial x$ equals the first n-1 rows of the matrix 
Dg, and $Dg(0) = I_n$, we have $Dh(0) = I_n$. It follows from the inverse 
function theorem that h is a diffeomorphism of a neighborhood $V_0$ of 0 with 
an open set $V_1$ in $R^n$. See Figure 18.4. Now we define $k : V_1 \to R^n$ by the 
equation 

$$k(y) = (y_1, \ldots, y_{n-1}, g_n(h^{-1}(y))).$$

Then k(0) = 0 (since $h^{-1}(0) = 0$ and $g_n(0) = 0$). Furthermore, 

$$Dk(y) = \begin{bmatrix} I_{n-1} \quad 0 \\ D(g_n \circ h^{-1})(y) \end{bmatrix}.$$

Applying the chain rule, we compute 

$$\begin{aligned}
D(g_n \circ h^{-1})(0) &= Dg_n(0) \cdot Dh^{-1}(0) \\
&= Dg_n(0) \cdot [Dh(0)]^{-1} \\
&= [0 \ \cdots \ 0 \ \ 1] \cdot I_n = [0 \ \cdots \ 0 \ \ 1].
\end{aligned}$$

Hence $Dk(0) = I_n$. It follows that k is a diffeomorphism of a neighborhood 
$W_1$ of 0 in $R^n$ with an open set $W_2$ in $R^n$. 

Now let $W_0 = h^{-1}(W_1)$. The diffeomorphisms 

$$W_0 \xrightarrow{h} W_1 \xrightarrow{k} W_2$$

are primitive. Furthermore, the composite $k \circ h$ equals $g|W_0$, as we now show. 
Given $x \in W_0$, let y = h(x). Now 

(*)    $y = (g_1(x), \ldots, g_{n-1}(x), x_n)$

by definition. Then 

$$\begin{aligned}
k(y) &= (y_1, \ldots, y_{n-1}, g_n(h^{-1}(y))) && \text{by definition,} \\
&= (g_1(x), \ldots, g_{n-1}(x), g_n(x)) && \text{by (*),} \\
&= g(x).
\end{aligned}$$
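
The factorization of Step 3 can be made concrete for a specific map of $R^2$. In the sketch below the map g is an illustrative choice satisfying g(0) = 0 and $Dg(0) = I_2$; it was picked so that $h^{-1}$ has a closed form, which is not true in general, where only local existence is guaranteed by the inverse function theorem.

```python
def g(x1, x2):
    """Illustrative map with g(0) = 0 and Dg(0) = I_2 (an assumption, not from the text)."""
    return (x1 + x2 ** 2, x2 + x1 * x2)

def h(x1, x2):
    """h(x) = (g_1(x), x_2): preserves the last coordinate."""
    return (g(x1, x2)[0], x2)

def h_inv(y1, y2):
    """Inverse of h, available in closed form for this particular g."""
    return (y1 - y2 ** 2, y2)

def k(y1, y2):
    """k(y) = (y_1, g_2(h^{-1}(y))): preserves the first coordinate."""
    return (y1, g(*h_inv(y1, y2))[1])

points = [(0.0, 0.0), (0.1, -0.2), (0.3, 0.05)]
errors = [max(abs(u - v) for u, v in zip(k(*h(*p)), g(*p))) for p in points]
print(errors)
```

Here h changes only the first coordinate and k changes only the second, so both are primitive, and their composite reproduces g at the sample points.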
Figure 18.5 


Step 4. Now we prove the theorem in the general case. 

Given $g : A \to B$, and given $a \in A$, let C be the matrix Dg(a). Define 
diffeomorphisms $t_1, t_2, T : R^n \to R^n$ by the equations 

$t_1(x) = x + a$ and $t_2(x) = x - g(a)$ and $T(x) = C^{-1} \cdot x$. 

Let $\tilde{g}$ equal the composite $T \circ t_2 \circ g \circ t_1$. Then $\tilde{g}$ is a diffeomorphism of the 
open set $t_1^{-1}(A)$ of $R^n$ with the open set $T(t_2(B))$ of $R^n$. See Figure 18.5. It 
has the property that 

$\tilde{g}(0) = 0$ and $D\tilde{g}(0) = I_n$; 

the first equation follows from the definition, while the second follows from 
the chain rule, since $DT(0) = C^{-1}$ and $Dt_i = I_n$ for i = 1, 2. 

By Step 3, there is an open set $W_0$ about 0 contained in $t_1^{-1}(A)$ such 
that $\tilde{g}|W_0$ factors into a sequence of (two) primitive diffeomorphisms. Let 
$W_2 = \tilde{g}(W_0)$. Let 

$A_0 = t_1(W_0)$ and $B_0 = t_2^{-1} T^{-1}(W_2)$. 

Then g carries $A_0$ onto $B_0$, and $g|A_0$ equals the composite 

$$A_0 \xrightarrow{t_1^{-1}} W_0 \xrightarrow{\tilde{g}} W_2 \xrightarrow{T^{-1}} T^{-1}(W_2) \xrightarrow{t_2^{-1}} B_0.$$

By Steps 1 and 2, each of the maps $t_1^{-1}$ and $t_2^{-1}$ and $T^{-1}$ factors into primitive 
transformations. The theorem follows. □ 
EXERCISES 

1. (a) If $f : R^2 \to R^1$ is of class $C^1$, show that f is not one-to-one. [Hint: 
If Df(x) = 0 for all x, then f is constant. If $Df(x_0) \neq 0$, apply the 
implicit function theorem.] 

(b) If $f : R^1 \to R^2$ is of class $C^1$, show that f does not carry $R^1$ onto $R^2$. 
In fact, show that $f(R^1)$ contains no open set of $R^2$. 

*2. Prove a generalization of Theorem 18.3 in which the statement “h is 
primitive” is interpreted to mean that h preserves all but one coordinate. 
[Hint: First show that if a = 0 and g(0) = 0 and $Dg(0) = I_n$, then g can 
be factored locally as $k \circ h$, where 

$$h(x) = (g_1(x), \ldots, g_{i-1}(x), x_i, g_{i+1}(x), \ldots, g_n(x))$$

and k preserves all but the i-th coordinate; and furthermore, h(0) = 
k(0) = 0 and $Dh(0) = Dk(0) = I_n$. Then proceed inductively.] 

3. Let A be open in $R^m$; let $g : A \to R^n$. If S is a subset of A, we say that g 
satisfies the Lipschitz condition on S if the function 

$$\lambda(x, y) = |g(x) - g(y)| / |x - y|$$

is bounded for x, y in S and $x \neq y$. We say that g is locally Lipschitz 
if each point of A has a neighborhood on which g satisfies the Lipschitz 
condition. 

(a) Show that if g is of class $C^1$, then g is locally Lipschitz. 

(b) Show that if g is locally Lipschitz, then g is continuous. 

(c) Give examples to show that the converses of (a) and (b) do not hold. 

(d) Let g be locally Lipschitz. Show that if C is a compact subset of A, 
then g satisfies the Lipschitz condition on C. [Hint: Show there is a 
neighborhood V of the diagonal $\Delta$ in $C \times C$ such that $\lambda$ is bounded 
on $V - \Delta$.] 

4. Let A be open in $R^n$; let $g : A \to R^n$ be locally Lipschitz. Show that if 
the subset E of A has measure zero in $R^n$, then g(E) has measure zero 
in $R^n$. 

5. Let A and B be open in $R^n$; let $g : A \to B$ be a one-to-one map carrying A 
onto B. 

(a) Show that (a) of Theorem 18.2 holds under the assumption that g and 
$g^{-1}$ are continuous. 

(b) Show that (b) of Theorem 18.2 holds under the assumption that g is 
locally Lipschitz and $g^{-1}$ is continuous. 
§19. PROOF OF THE CHANGE OF VARIABLES THEOREM 


Now we prove the general change of variables theorem. We prove first the 
“only if” part of the theorem. It is stated in the following lemma: 

Lemma 19.1. Let $g : A \to B$ be a diffeomorphism of open sets in 
$R^n$. Then for every continuous function $f : B \to R$ that is integrable 
over B, the function $(f \circ g)|\det Dg|$ is integrable over A, and 

$$\int_B f = \int_A (f \circ g)\, |\det Dg|.$$


Proof. The proof proceeds in several steps, by which one reduces the 
proof to successively simpler cases. 

Step 1. Let $g : U \to V$ and $h : V \to W$ be diffeomorphisms of open 
sets in $R^n$. We show that if the lemma holds for g and for h, then it holds for 
$h \circ g$. 

Suppose $f : W \to R$ is a continuous function that is integrable over W. 
It follows from our hypothesis that 

$$\int_W f = \int_V (f \circ h)\, |\det Dh| = \int_U (f \circ h \circ g)\, |(\det Dh) \circ g|\, |\det Dg|;$$

the second integral exists and equals the first integral because the lemma holds 
for h; and the third integral exists and equals the second integral because the 
lemma holds for g. In order to show that the lemma holds for $h \circ g$, it suffices 
to show that 

$$|(\det Dh) \circ g|\, |\det Dg| = |\det D(h \circ g)|.$$

This result follows from the chain rule. We have 

$$D(h \circ g)(x) = Dh(g(x)) \cdot Dg(x),$$

whence 

$$\det D(h \circ g) = [(\det Dh) \circ g] \cdot [\det Dg],$$

as desired. 
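
The determinant identity used in Step 1 can be spot-checked numerically. The sketch below uses two illustrative smooth maps of $R^2$ (assumptions, not from the text) and central differences; it checks only the identity det D(h ∘ g) = [(det Dh) ∘ g] · [det Dg] at one point and makes no claim that these maps are diffeomorphisms on any particular domain.

```python
import math

def jacobian_2d(func, p, step=1e-6):
    """Central-difference approximation of the 2 by 2 derivative matrix of func at p."""
    x, y = p
    fx_plus, fx_minus = func(x + step, y), func(x - step, y)
    fy_plus, fy_minus = func(x, y + step), func(x, y - step)
    return [[(fx_plus[i] - fx_minus[i]) / (2 * step),
             (fy_plus[i] - fy_minus[i]) / (2 * step)] for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def g(x, y):
    """Illustrative inner map."""
    return (x + 0.5 * math.sin(y), y + 0.25 * x * x)

def h(u, v):
    """Illustrative outer map."""
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def h_of_g(x, y):
    return h(*g(x, y))

p = (0.3, -0.4)
lhs = det2(jacobian_2d(h_of_g, p))                           # det D(h o g)(p)
rhs = det2(jacobian_2d(h, g(*p))) * det2(jacobian_2d(g, p))  # det Dh(g(p)) * det Dg(p)
print(lhs, rhs)
```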

Step 2. Suppose that for each $x \in A$, there is a neighborhood U of x 
contained in A such that the lemma holds for the diffeomorphism $g : U \to V$ 
(where V = g(U)) and all continuous functions $f : V \to R$ whose supports 
are compact subsets of V. Then we show that the lemma holds for g. 

Roughly speaking, this statement says that if the lemma holds locally for 
g and functions f having compact support, then it holds for g and all f. 

This is the place in the proof where we use partitions of unity. Write A 
as the union of a collection of open sets $U_\alpha$ such that if $V_\alpha = g(U_\alpha)$, then 
the lemma holds for the diffeomorphism $g : U_\alpha \to V_\alpha$ and all continuous 
functions $f : V_\alpha \to R$ whose supports are compact subsets of $V_\alpha$. The union 
of the open sets $V_\alpha$ equals B. Choose a partition of unity $\{\phi_i\}$ on B, having 
compact supports, that is dominated by the collection $\{V_\alpha\}$. We show that 
the collection $\{\phi_i \circ g\}$ is a partition of unity on A, having compact supports. 
See Figure 19.1. 



Figure 19.1 


First, we note that $\phi_i(g(x)) \geq 0$ for $x \in A$. Second, we show $\phi_i \circ g$ has 
compact support. Let $T_i$ = Support $\phi_i$. The set $g^{-1}(T_i)$ is compact because 
$T_i$ is compact and $g^{-1}$ is continuous; furthermore, $\phi_i \circ g$ vanishes outside 
$g^{-1}(T_i)$. The closed set $S_i$ = Support $(\phi_i \circ g)$ is contained in $g^{-1}(T_i)$, so 
that $S_i$ is compact. Third, we check the local finiteness condition. Let x be 
a point of A. The point y = g(x) has a neighborhood W that intersects $T_i$ 
for only finitely many values of i. Then the set $g^{-1}(W)$ is an open set about 
x that intersects $S_i$ for at most these same values of i. Fourth, we note that 

$$\sum \phi_i(g(x)) = \sum \phi_i(y) = 1.$$

Thus $\{\phi_i \circ g\}$ is a partition of unity on A. 

Now we complete the proof of Step 2. Suppose $f : B \to R$ is continuous 
and f is integrable over B. We have 

$$\int_B f = \sum_{i=1}^{\infty} \Big[ \int_B \phi_i f \Big]$$

by Theorem 16.5. Given i, choose $\alpha$ so that $T_i \subset V_\alpha$. The function $\phi_i f$ is 
continuous on B and vanishes outside the compact set $T_i$. Then 

$$\int_B \phi_i f = \int_{T_i} \phi_i f = \int_{V_\alpha} \phi_i f,$$

by Lemma 16.4. Our lemma holds by hypothesis for $g : U_\alpha \to V_\alpha$ and the 
function $\phi_i f$. Therefore 

$$\int_{V_\alpha} \phi_i f = \int_{U_\alpha} (\phi_i \circ g)(f \circ g)\, |\det Dg|.$$

Since the integrand on the right vanishes outside the compact set $S_i$, we can 
apply Lemma 16.4 again to conclude that 

$$\int_B \phi_i f = \int_A (\phi_i \circ g)(f \circ g)\, |\det Dg|.$$

We then sum over i to obtain the equation 

(*)    $\int_B f = \sum_{i=1}^{\infty} \Big[ \int_A (\phi_i \circ g)(f \circ g)\, |\det Dg| \Big].$

Since $|f|$ is integrable over B, equation (*) holds if f is replaced throughout 
by $|f|$. Since $\{\phi_i \circ g\}$ is a partition of unity on A, it then follows from 
Theorem 16.5 that $(f \circ g)|\det Dg|$ is integrable over A. We then apply (*) 
to the function f to conclude that 

$$\int_B f = \int_A (f \circ g)\, |\det Dg|.$$

Step 3. We show that the lemma holds for $n = 1$.

Let $g : A \to B$ be a diffeomorphism of open sets in $\mathbf{R}^1$. Given $x \in A$, let $I$
be a closed interval in $A$ whose interior contains $x$; and let $J = g(I)$. Now $J$ is
an interval in $\mathbf{R}^1$ and $g$ maps Int $I$ onto Int $J$. (See Theorems 17.1 and 18.2.)
Since $x$ is arbitrary, it suffices by Step 2 to prove the lemma holds for the
diffeomorphism $g : \operatorname{Int} I \to \operatorname{Int} J$ and any continuous function $f : \operatorname{Int} J \to \mathbf{R}$
whose support is a compact subset of Int $J$. That is, we wish to verify the
equation

$$(**)\qquad \int_{\operatorname{Int} J} f = \int_{\operatorname{Int} I} (f \circ g)\,|g'|.$$

This is easy. First, we extend $f$ to a continuous function defined on $J$ by
letting it vanish on Bd $J$. Then $(**)$ is equivalent to the equation

$$\int_J f = \int_I (f \circ g)\,|g'|$$

in ordinary integrals. But this equation follows from Theorem 17.1.

Step 4. Let $n > 1$. In order to prove the lemma for an arbitrary
diffeomorphism $g : A \to B$ of open sets in $\mathbf{R}^n$, we show that it suffices to prove it
for a primitive diffeomorphism $h : U \to V$ of open sets in $\mathbf{R}^n$.

Suppose the lemma holds for all primitive diffeomorphisms in $\mathbf{R}^n$. Let
$g : A \to B$ be an arbitrary diffeomorphism in $\mathbf{R}^n$. Given $\mathbf{x} \in A$, there exists
a neighborhood $U_0$ of $\mathbf{x}$ and a sequence of primitive diffeomorphisms

$$U_0 \xrightarrow{\;h_1\;} U_1 \xrightarrow{\;h_2\;} \cdots \xrightarrow{\;h_k\;} U_k$$

whose composite equals $g|U_0$. Since the lemma holds for each of the
diffeomorphisms $h_i$, it follows from Step 1 that it holds for $g|U_0$. Then because $\mathbf{x}$
is arbitrary, it follows from Step 2 that it holds for $g$.

Step 5. We show that if the lemma holds in dimension $n - 1$, it holds
in dimension $n$.

This step completes the proof of the lemma.

In view of Step 4, it suffices to prove the lemma for a primitive
diffeomorphism $h : U \to V$ of open sets in $\mathbf{R}^n$. For convenience in notation, we assume
that $h$ preserves the last coordinate.

Let $\mathbf{p} \in U$; let $\mathbf{q} = h(\mathbf{p})$. Choose a rectangle $Q$ contained in $V$ whose
interior contains $\mathbf{q}$; let $S = h^{-1}(Q)$. By Theorem 18.2, the map $h$ defines a
diffeomorphism of Int $S$ with Int $Q$. Since $\mathbf{p}$ is arbitrary, it suffices by Step 2
to prove that the lemma holds for the diffeomorphism $h : \operatorname{Int} S \to \operatorname{Int} Q$ and
any continuous function $f : \operatorname{Int} Q \to \mathbf{R}$ whose support is a compact subset of
Int $Q$. See Figure 19.2.



Figure 19.2

Now $(f \circ h)\,|\det Dh|$ vanishes outside a compact subset of Int $S$; hence
it is integrable over Int $S$ by Lemma 16.4. We need to show that

$$\int_{\operatorname{Int} Q} f = \int_{\operatorname{Int} S} (f \circ h)\,|\det Dh|.$$

This is an equation involving extended integrals. Since these integrals
exist as ordinary integrals, it is by Theorem 15.4 equivalent to the corresponding
equation in ordinary integrals.

Let us extend $f$ to $\mathbf{R}^n$ by letting it vanish outside Int $Q$, and let us define
a function $F : \mathbf{R}^n \to \mathbf{R}$ by letting it equal $(f \circ h)\,|\det Dh|$ on Int $S$ and vanish
elsewhere. Then both $f$ and $F$ are continuous, and our desired equation is
equivalent to the equation

$$\int_Q f = \int_S F.$$

The rectangle $Q$ has the form $Q = D \times I$, where $D$ is a rectangle in
$\mathbf{R}^{n-1}$ and $I$ is a closed interval in $\mathbf{R}$. Since $S$ is compact, its projection on the
subspace $\mathbf{R}^{n-1} \times 0$ is compact and thus contained in a set of the form $E \times 0$,
where $E$ is a rectangle in $\mathbf{R}^{n-1}$. Because $h$ preserves the last coordinate, the
set $S$ is contained in the rectangle $E \times I$. See Figure 19.3.



Figure 19.3 


Because $F$ vanishes outside $S$, our desired equation can be written in the
form

$$\int_Q f = \int_{E \times I} F,$$

which by the Fubini theorem is equivalent to the equation

$$\int_{t \in I} \int_{\mathbf{y} \in D} f(\mathbf{y}, t) = \int_{t \in I} \int_{\mathbf{x} \in E} F(\mathbf{x}, t).$$

It suffices to show the inner integrals are equal. This we now do.

The intersections of $U$ and $V$ with $\mathbf{R}^{n-1} \times t$ are sets of the form $U_t \times t$
and $V_t \times t$, respectively, where $U_t$ and $V_t$ are open sets in $\mathbf{R}^{n-1}$. Similarly,
the intersection of $S$ with $\mathbf{R}^{n-1} \times t$ has the form $S_t \times t$, where $S_t$ is a compact
set in $\mathbf{R}^{n-1}$. Since $F$ vanishes outside $S$, equality of the “inner integrals” is
equivalent to the equation

$$\int_{\mathbf{y} \in D} f(\mathbf{y}, t) = \int_{\mathbf{x} \in S_t} F(\mathbf{x}, t),$$

and this is in turn equivalent by Lemma 16.4 to the equation

$$\int_{\mathbf{y} \in V_t} f(\mathbf{y}, t) = \int_{\mathbf{x} \in U_t} F(\mathbf{x}, t).$$

This is an equation in $(n-1)$-dimensional integrals, to which the induction
hypothesis applies.

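The diffeomorphism $h : U \to V$ has the form

$$h(\mathbf{x}, t) = (k(\mathbf{x}, t),\, t)$$

for some $C^1$ function $k : U \to \mathbf{R}^{n-1}$. The derivative of $h$ has the form

$$Dh = \begin{bmatrix} \partial k/\partial \mathbf{x} & \partial k/\partial t \\ 0 \;\cdots\; 0 & 1 \end{bmatrix},$$

so that $\det Dh = \det \partial k/\partial \mathbf{x}$. For fixed $t$, the map $\mathbf{x} \to k(\mathbf{x}, t)$ is a $C^1$ map
carrying $U_t$ onto $V_t$ in a one-to-one fashion. Because $\det \partial k/\partial \mathbf{x} = \det Dh \neq 0$,
this map is in fact a diffeomorphism of open sets in $\mathbf{R}^{n-1}$.

We apply the induction hypothesis; we have, for fixed $t$, the equation

$$\int_{\mathbf{y} \in V_t} f(\mathbf{y}, t) = \int_{\mathbf{x} \in U_t} f(k(\mathbf{x}, t), t)\,|\det \partial k/\partial \mathbf{x}|.$$

For $\mathbf{x} \in U_t$, the integrand on the right equals

$$f(h(\mathbf{x}, t))\,|\det Dh| = F(\mathbf{x}, t).$$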


The lemma follows. □ 
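
As a numerical sanity check of the formula just proved, the following Python sketch compares the two sides in dimension one. The map $g(x) = x^2$ on $A = (1, 2)$, the function $f(y) = 1/y$, and the helper `midpoint_integral` are illustrative choices of ours, not part of the text; under these assumptions both sides should be close to $\log 4$.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Approximate the integral of func over [lo, hi] by the midpoint rule."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (j + 0.5) * width) for j in range(steps))

# Hypothetical example: g(x) = x**2 is a diffeomorphism of A = (1, 2) with B = (1, 4).
def g(x):
    return x * x

def dg(x):
    # Derivative of g; in dimension one, |det Dg| is just |g'|.
    return 2.0 * x

def f(y):
    return 1.0 / y

# Left side: integral of f over B.
left = midpoint_integral(f, 1.0, 4.0)
# Right side: integral of (f o g)|g'| over A.
right = midpoint_integral(lambda x: f(g(x)) * abs(dg(x)), 1.0, 2.0)

print(left, right, math.log(4.0))
```

The agreement is only up to the discretization error of the midpoint rule, so the check illustrates the formula rather than proving anything.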

We now prove the “if” part of the change of variables theorem.

Lemma 19.2. Let $g : A \to B$ be a diffeomorphism of open sets in
$\mathbf{R}^n$; let $f : B \to \mathbf{R}$ be continuous. If $(f \circ g)\,|\det Dg|$ is integrable over $A$,
then $f$ is integrable over $B$.

Proof. We apply the lemma just proved to the diffeomorphism $g^{-1} : B \to A$.
The function $F = (f \circ g)\,|\det Dg|$ is continuous on $A$, and is
integrable over $A$ by hypothesis. It follows from Lemma 19.1 that the function

$$(F \circ g^{-1})\,|\det D(g^{-1})|$$

is integrable over $B$. But this function equals $f$. For if $g(\mathbf{x}) = \mathbf{y}$, then

$$(D(g^{-1}))(\mathbf{y}) = [Dg(\mathbf{x})]^{-1}$$

by Theorem 7.4, so that

$$(F \circ g^{-1})(\mathbf{y}) \cdot |(\det D(g^{-1}))(\mathbf{y})| = F(\mathbf{x}) \cdot |1/\det Dg(\mathbf{x})| = f(\mathbf{y}). \qquad \square$$


EXAMPLE 1. If it happens that both integrals in the change of variables
theorem exist as ordinary integrals, then the theorem implies that these two
ordinary integrals are equal. However, it is possible for only one of these integrals,
or neither, to exist as an ordinary integral. Consider, for instance,
Example 2 of §17. The change of variables theorem, applied to the diffeomorphism
$g : (-\pi/2, \pi/2) \to (-1, 1)$ given by $g(x) = \sin x$, implies that

$$\int_{(-1,1)} 1/(1 - y^2)^{1/2} = \int_{(-\pi/2,\, \pi/2)} 1.$$

Here the integral on the right exists as an ordinary integral, but the integral
on the left does not.
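
The example can be illustrated numerically. Since the left-hand integral exists only as an extended integral, the Python sketch below truncates it: for $0 < c < \pi/2$ the map $g(x) = \sin x$ carries $(-c, c)$ onto $(-\sin c, \sin c)$, where both integrals are ordinary and the right side equals $2c$. The cutoff values and the midpoint-rule helper are assumptions made purely for illustration.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (j + 0.5) * width) for j in range(steps))

def integrand(y):
    # The function 1/(1 - y^2)^(1/2), which is unbounded near y = -1 and y = 1.
    return 1.0 / math.sqrt(1.0 - y * y)

def truncated_sides(c):
    """Return (left, right) for the truncated interval (-c, c), with 0 < c < pi/2."""
    left = midpoint_integral(integrand, -math.sin(c), math.sin(c))
    right = 2.0 * c  # integral of the constant function 1 over (-c, c)
    return left, right

for c in (0.5, 1.0, 1.5):
    left, right = truncated_sides(c)
    print(c, left, right)
```

As $c$ increases toward $\pi/2$ both columns approach $\pi$, although the integrand on the left becomes unbounded.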


EXERCISES 


1. Let $A$ be the region in $\mathbf{R}^2$ bounded by the curve $x^2 - xy + 2y^2 = 1$.
Express the integral $\int_A xy$ as an integral over the unit ball in $\mathbf{R}^2$ centered
at $\mathbf{0}$. [Hint: Complete the square.]

2. (a) Express the volume of the solid in $\mathbf{R}^3$ bounded below by the surface
$z = x^2 + 2y^2$, and above by the plane $z = 2x + 6y + 1$, as the integral of
a suitable function over the unit ball in $\mathbf{R}^2$ centered at $\mathbf{0}$.

(b) Find this volume.

3. Let $\pi_k : \mathbf{R}^n \to \mathbf{R}$ be the $k$th projection function, defined by the
equation $\pi_k(\mathbf{x}) = x_k$. Let $S$ be a rectifiable set in $\mathbf{R}^n$ with non-zero volume.
The centroid of $S$ is defined to be the point $\mathbf{c}(S)$ of $\mathbf{R}^n$ whose $k$th
coordinate, for each $k$, is given by the equation

$$c_k(S) = [1/v(S)] \int_S \pi_k.$$

We say that $S$ is symmetric with respect to the subspace $x_k = 0$ of $\mathbf{R}^n$
if the transformation

$$h(\mathbf{x}) = (x_1, \ldots, x_{k-1}, -x_k, x_{k+1}, \ldots, x_n)$$

carries $S$ onto itself. In this case, show that $c_k(S) = 0$.

4. Find the centroid of the upper half-ball of radius $a$ in $\mathbf{R}^3$. (See Exercise 2
of §17.)

5. Let $A$ be an open rectifiable set in $\mathbf{R}^{n-1}$. Given the point $\mathbf{p}$ in $\mathbf{R}^n$ with
$p_n > 0$, let $S$ be the subset of $\mathbf{R}^n$ defined by the equation

$$S = \{\mathbf{x} \mid \mathbf{x} = (1 - t)\mathbf{a} + t\mathbf{p}, \text{ where } \mathbf{a} \in A \times 0 \text{ and } 0 < t < 1\}.$$

Then $S$ is the union of all open line segments in $\mathbf{R}^n$ joining $\mathbf{p}$ to points
of $A \times 0$; its closure is called the cone with base $A \times 0$ and vertex $\mathbf{p}$.
Figure 19.4 illustrates the case $n = 3$.

(a) Define a diffeomorphism $g$ of $A \times (0, 1)$ with $S$.

(b) Find $v(S)$ in terms of $v(A)$.

*(c) Show that the centroid $\mathbf{c}(S)$ of $S$ lies on the line segment joining
$\mathbf{c}(A)$ and $\mathbf{p}$; express it in terms of $\mathbf{c}(A)$ and $\mathbf{p}$.

Figure 19.4

*6. Let $B^n(a)$ denote the closed ball of radius $a$ in $\mathbf{R}^n$, centered at $\mathbf{0}$.

(a) Show that

$$v(B^n(a)) = \lambda_n a^n$$

for some constant $\lambda_n$. Then $\lambda_n = v(B^n(1))$.

(b) Compute $\lambda_1$ and $\lambda_2$.

(c) Compute $\lambda_n$ in terms of $\lambda_{n-2}$.

(d) Obtain a formula for $\lambda_n$. [Hint: Consider two cases, according as $n$
is even or odd.]

*7. (a) Find the centroid of the upper half-ball

$$B^n_+(a) = \{\mathbf{x} \mid \mathbf{x} \in B^n(a) \text{ and } x_n \geq 0\}$$

in terms of $\lambda_n$ and $\lambda_{n-1}$ and $a$, where $\lambda_n = v(B^n(1))$.

(b) Express $\mathbf{c}(B^n_+(a))$ in terms of $\mathbf{c}(B^{n-2}_+(a))$.


§20. APPLICATIONS OF CHANGE OF VARIABLES 


The meaning of the determinant 

We now give a geometric interpretation of the determinant function. 

Theorem 20.1. Let $A$ be an $n$ by $n$ matrix. Let $h : \mathbf{R}^n \to \mathbf{R}^n$ be
the linear transformation $h(\mathbf{x}) = A \cdot \mathbf{x}$. Let $S$ be a rectifiable set in $\mathbf{R}^n$,
and let $T = h(S)$. Then

$$v(T) = |\det A| \cdot v(S).$$

Proof. Consider first the case where $A$ is non-singular. Then $h$ is a
diffeomorphism of $\mathbf{R}^n$ with itself; $h$ carries Int $S$ onto Int $T$; and $T$ is rectifiable.
We have

$$v(T) = v(\operatorname{Int} T) = \int_{\operatorname{Int} T} 1 = \int_{\operatorname{Int} S} |\det Dh|$$

by the change of variables theorem. Hence

$$v(T) = \int_{\operatorname{Int} S} |\det A| = |\det A| \cdot v(S).$$

Consider now the case where $A$ is singular; then $\det A = 0$. We show
that $v(T) = 0$. Since $S$ is bounded, so is $T$. The transformation $h$ carries $\mathbf{R}^n$
onto a linear subspace $V$ of $\mathbf{R}^n$ of dimension $p$ less than $n$, which has measure
zero in $\mathbf{R}^n$, as you can check. Then $\bar{T}$ is closed and bounded and has measure
zero in $\mathbf{R}^n$. The function $1_T$ is continuous and vanishes outside $\bar{T}$; hence the
integral $\int_T 1$ exists and equals 0. □
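
Theorem 20.1 can be checked by hand in the plane. The Python sketch below maps the unit square by $h(\mathbf{x}) = A \cdot \mathbf{x}$ and compares the area of the image with $|\det A|$. The matrix `A`, the helper names, and the use of the shoelace formula for polygon area are our own illustrative choices, not part of the text.

```python
def det2(m):
    """Determinant of a 2 by 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def apply_matrix(m, p):
    """Image of the point p under the linear transformation x -> m . x."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])

def shoelace_area(poly):
    """Unsigned area of a polygon given by its vertices in order."""
    twice = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        twice += x0 * y1 - x1 * y0
    return abs(twice) / 2.0

# Hypothetical non-singular matrix and the unit square S = [0, 1]^2.
A = [[2.0, 1.0], [0.5, 3.0]]
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image = [apply_matrix(A, p) for p in square]

print(shoelace_area(image), abs(det2(A)) * shoelace_area(square))

# A singular matrix collapses the square onto a line segment, of area zero.
singular = [[1.0, 2.0], [2.0, 4.0]]
collapsed = [apply_matrix(singular, p) for p in square]
print(shoelace_area(collapsed))
```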


This theorem gives one interpretation of the number $|\det A|$; it is the
factor by which the linear transformation $h(\mathbf{x}) = A \cdot \mathbf{x}$ multiplies volumes.
Here is another interpretation.

Definition. Let $\mathbf{a}_1, \ldots, \mathbf{a}_k$ be independent vectors in $\mathbf{R}^n$. We define
the $k$-dimensional parallelopiped $\mathcal{P} = \mathcal{P}(\mathbf{a}_1, \ldots, \mathbf{a}_k)$ to be the set of all $\mathbf{x}$
in $\mathbf{R}^n$ such that

$$\mathbf{x} = c_1 \mathbf{a}_1 + \cdots + c_k \mathbf{a}_k$$

for scalars $c_i$ with $0 \leq c_i \leq 1$. The vectors $\mathbf{a}_1, \ldots, \mathbf{a}_k$ are called the edges
of $\mathcal{P}$.

A few sketches will convince you that a 2-dimensional parallelopiped is
what we usually call a “parallelogram,” and a 3-dimensional one is what we
usually call a “parallelopiped.” See Figure 20.1, which pictures parallelograms
in $\mathbf{R}^2$ and $\mathbf{R}^3$ and a 3-dimensional parallelopiped in $\mathbf{R}^3$.



We eventually wish to define what we mean by the “$k$-dimensional volume”
of a $k$-parallelopiped in $\mathbf{R}^n$. In the case $k = n$, we already have a notion
of volume, as defined in §14. It satisfies the following formula:

Theorem 20.2. Let $\mathbf{a}_1, \ldots, \mathbf{a}_n$ be $n$ independent vectors in $\mathbf{R}^n$.
Let $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$ be the $n$ by $n$ matrix with columns $\mathbf{a}_1, \ldots, \mathbf{a}_n$. Then

$$v(\mathcal{P}(\mathbf{a}_1, \ldots, \mathbf{a}_n)) = |\det A|.$$

Proof. Consider the linear transformation $h : \mathbf{R}^n \to \mathbf{R}^n$ given by
$h(\mathbf{x}) = A \cdot \mathbf{x}$. Then $h$ carries the unit basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to the
vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$, since $A \cdot \mathbf{e}_j = \mathbf{a}_j$ by direct computation. Furthermore, $h$
carries the unit cube $I^n = [0, 1]^n$ onto the parallelopiped $\mathcal{P}(\mathbf{a}_1, \ldots, \mathbf{a}_n)$. By
the preceding theorem,

$$v(\mathcal{P}(\mathbf{a}_1, \ldots, \mathbf{a}_n)) = |\det A| \cdot v(I^n) = |\det A|. \qquad \square$$


EXAMPLE 1. In calculus, one studies the 3-dimensional version of this
formula. One learns that the volume of the parallelopiped with edges $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ is
given (up to sign) by the “triple scalar product”

$$\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \det [\mathbf{a} \ \mathbf{b} \ \mathbf{c}].$$

(We write $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ as column matrices here, as usual.) One learns also that
the sign of the triple scalar product depends on whether the triple $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$
is “right-handed” or “left-handed.” We now generalize this second notion to
$\mathbf{R}^n$, and indeed, to an arbitrary finite-dimensional vector space $V$.
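
The identity between the triple scalar product and the determinant is easy to test on concrete vectors. In the Python sketch below, the three edge vectors and the helper names are hypothetical choices made for illustration; the determinant is expanded along the first row of the matrix whose columns are the given vectors.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(b, c):
    """Cross product of two vectors in R^3."""
    return (b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0])

def det_columns(a, b, c):
    """Determinant of the 3 by 3 matrix [a b c], expanded along its first row."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

# Hypothetical edges of a parallelopiped in R^3.
a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)

triple = dot(a, cross(b, c))
volume = abs(det_columns(a, b, c))
print(triple, det_columns(a, b, c), volume)

# Exchanging two edges reverses the sign but not the volume.
print(dot(b, cross(a, c)), det_columns(b, a, c))
```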

Definition. Let $V$ be an $n$-dimensional vector space. An $n$-tuple
$(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ of independent vectors in $V$ is called an $n$-frame in $V$. In
$\mathbf{R}^n$, we call such a frame right-handed if

$$\det [\mathbf{a}_1 \cdots \mathbf{a}_n] > 0;$$

we call it left-handed otherwise. The collection of all right-handed frames in
$\mathbf{R}^n$ is called an orientation of $\mathbf{R}^n$; and so is the collection of all left-handed
frames. More generally, choose a linear isomorphism $T : \mathbf{R}^n \to V$, and define
one orientation of $V$ to consist of all frames of the form $(T(\mathbf{a}_1), \ldots, T(\mathbf{a}_n))$
for which $(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ is a right-handed frame in $\mathbf{R}^n$, and the other orientation
of $V$ to consist of all such frames for which $(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ is left-handed. Thus
$V$ has two orientations; each is called the reverse, or the opposite, of the
other.

It is easy to see that this notion is well-defined (independent of the choice
of $T$). Note that in an arbitrary $n$-dimensional vector space, there is no
well-defined notion of “right-handed,” although there is a well-defined notion of
orientation.

EXAMPLE 2. In $\mathbf{R}^1$, a frame consists of a single non-zero number; it is
right-handed if it is positive, and left-handed if it is negative. In $\mathbf{R}^2$, a frame $(\mathbf{a}_1, \mathbf{a}_2)$
is right-handed if one must rotate $\mathbf{a}_1$ in a counterclockwise direction through
an angle less than $\pi$ to make it point in the same direction as $\mathbf{a}_2$. (See the
exercises.) In $\mathbf{R}^3$, a frame $(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3)$ is right-handed if curling the fingers of
one’s right hand in the direction from $\mathbf{a}_1$ to $\mathbf{a}_2$ makes one’s thumb point in
the direction of $\mathbf{a}_3$. See Figure 20.2.





One way to justify this statement is to note that if one has a frame
$(\mathbf{a}_1(t), \mathbf{a}_2(t), \mathbf{a}_3(t))$ that varies continuously as a function of $t$ for $0 \leq t \leq 1$,
and if the frame is right-handed when $t = 0$, then it remains right-handed for
all $t$. For the function $\det [\mathbf{a}_1 \ \mathbf{a}_2 \ \mathbf{a}_3]$ cannot change sign, by the
intermediate-value theorem. Then since the frame $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ satisfies the “curled
right-hand rule” as well as the condition $\det [\mathbf{e}_1 \ \mathbf{e}_2 \ \mathbf{e}_3] > 0$, so does the frame
corresponding to any other position of the “curled right hand” in 3-dimensional
space.

We now obtain another interpretation of the sign of the determinant. 

Theorem 20.3. Let $C$ be a non-singular $n$ by $n$ matrix. Let
$h : \mathbf{R}^n \to \mathbf{R}^n$ be the linear transformation $h(\mathbf{x}) = C \cdot \mathbf{x}$. Let $(\mathbf{a}_1, \ldots, \mathbf{a}_n)$
be a frame in $\mathbf{R}^n$. If $\det C > 0$, the frames

$$(\mathbf{a}_1, \ldots, \mathbf{a}_n) \quad \text{and} \quad (h(\mathbf{a}_1), \ldots, h(\mathbf{a}_n))$$

belong to the same orientation of $\mathbf{R}^n$; if $\det C < 0$, they belong to
opposite orientations of $\mathbf{R}^n$.

If $\det C > 0$, we say $h$ is orientation-preserving; if $\det C < 0$, we
say $h$ is orientation-reversing.

Proof. Let $\mathbf{b}_i = h(\mathbf{a}_i)$ for each $i$. Then

$$C \cdot [\mathbf{a}_1 \cdots \mathbf{a}_n] = [\mathbf{b}_1 \cdots \mathbf{b}_n],$$

so that

$$(\det C) \cdot \det [\mathbf{a}_1 \cdots \mathbf{a}_n] = \det [\mathbf{b}_1 \cdots \mathbf{b}_n].$$

If $\det C > 0$, then $\det [\mathbf{a}_1 \cdots \mathbf{a}_n]$ and $\det [\mathbf{b}_1 \cdots \mathbf{b}_n]$ have the same sign; if
$\det C < 0$, they have opposite signs. □
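
The statement of Theorem 20.3 can be tried out on small examples. The Python sketch below is an illustration under our own assumptions: the frame, the two matrices, and all function names are hypothetical, and the determinant is computed by cofactor expansion, which is adequate only for small $n$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def columns_to_matrix(vectors):
    """Matrix (list of rows) whose successive columns are the given vectors."""
    return [[v[i] for v in vectors] for i in range(len(vectors[0]))]

def matvec(c, x):
    return [sum(c[i][j] * x[j] for j in range(len(x))) for i in range(len(c))]

def same_orientation(frame, c):
    """True if (a_1, ..., a_n) and (C a_1, ..., C a_n) have determinants of equal sign."""
    before = det(columns_to_matrix(frame))
    after = det(columns_to_matrix([matvec(c, a) for a in frame]))
    return before * after > 0

# Hypothetical right-handed frame in R^3 and two non-singular matrices.
frame = [[1, 0, 0], [1, 1, 0], [0, 2, 1]]
swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]      # det = -1
stretch = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]   # det = 6

print(same_orientation(frame, swap), same_orientation(frame, stretch))
```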


Invariance of volume under isometries 

Definition. The vectors $\mathbf{a}_1, \ldots, \mathbf{a}_k$ of $\mathbf{R}^n$ are said to form an
orthogonal set if $\langle \mathbf{a}_i, \mathbf{a}_j \rangle = 0$ for $i \neq j$. They form an orthonormal set if
they satisfy the additional condition $\langle \mathbf{a}_i, \mathbf{a}_i \rangle = 1$ for all $i$. If the vectors
$\mathbf{a}_1, \ldots, \mathbf{a}_k$ form an orthogonal set and are non-zero, then the vectors
$\mathbf{a}_1/\|\mathbf{a}_1\|, \ldots, \mathbf{a}_k/\|\mathbf{a}_k\|$ form an orthonormal set.

An orthogonal set of non-zero vectors $\mathbf{a}_1, \ldots, \mathbf{a}_k$ is always independent.
For, given the equation

$$d_1 \mathbf{a}_1 + \cdots + d_k \mathbf{a}_k = \mathbf{0},$$

one takes the dot product of both sides with $\mathbf{a}_i$ to obtain the equation
$d_i \langle \mathbf{a}_i, \mathbf{a}_i \rangle = 0$, which implies (since $\mathbf{a}_i \neq \mathbf{0}$) that $d_i = 0$.

An orthogonal set of non-zero vectors in $\mathbf{R}^n$ that consists of $n$ vectors is
thus a basis for $\mathbf{R}^n$. The set $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is one such basis for $\mathbf{R}^n$, but there are
many others.


Definition. An $n$ by $n$ matrix $A$ is called an orthogonal matrix if
the columns of $A$ form an orthonormal set. This condition is equivalent to
the matrix equation

$$A^{tr} \cdot A = I_n,$$

as you can check.

If $A$ is orthogonal, then $A$ is square and $A^{tr}$ is a left inverse for $A$; it
follows that $A^{tr}$ is also a right inverse for $A$. Thus $A$ is an orthogonal matrix
if and only if $A$ is non-singular and $A^{tr} = A^{-1}$.

Note that if $A$ is orthogonal, then $\det A = \pm 1$. For

$$(\det A)^2 = (\det A^{tr})(\det A) = \det(A^{tr} \cdot A) = \det I_n = 1.$$

The set of orthogonal matrices forms what is called, in modern algebra,
a group. That is the substance of the following theorem:

Theorem 20.4. Let $A$, $B$, $C$ be orthogonal $n$ by $n$ matrices. Then:

(a) $A \cdot B$ is orthogonal.

(b) $A \cdot (B \cdot C) = (A \cdot B) \cdot C$.

(c) There is an orthogonal matrix $I_n$ such that $A \cdot I_n = I_n \cdot A = A$
for all orthogonal $A$.

(d) Given $A$, there is an orthogonal matrix $A^{-1}$ such that
$A \cdot A^{-1} = A^{-1} \cdot A = I_n$.

Proof. To check (a), we compute

$$(A \cdot B)^{tr} \cdot (A \cdot B) = (B^{tr} \cdot A^{tr}) \cdot (A \cdot B) = B^{tr} \cdot B = I_n.$$

Condition (b) is immediate and (c) follows from the fact that $I_n$ is orthogonal.
To check (d), we note that since $A^{tr}$ equals $A^{-1}$,

$$I_n = A \cdot A^{tr} = (A^{tr})^{tr} \cdot A^{tr} = (A^{-1})^{tr} \cdot A^{-1}.$$

Thus $A^{-1}$ is orthogonal, as desired. □
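
The group properties of Theorem 20.4 can be observed numerically for 2 by 2 matrices. The following Python sketch uses rotations and a reflection as sample orthogonal matrices; these samples, the tolerance `tol`, and the helper names are assumptions of ours, and the test `is_orthogonal` simply checks the matrix equation $A^{tr} \cdot A = I_n$ up to rounding error.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def is_orthogonal(m, tol=1e-12):
    """Check the matrix equation A^tr . A = I_n up to rounding error."""
    n = len(m)
    product = matmul(transpose(m), m)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical sample matrices: a rotation, and a rotation times a reflection.
reflection = [[1.0, 0.0], [0.0, -1.0]]
A = rotation(0.7)
B = matmul(rotation(-1.9), reflection)

print(is_orthogonal(A), is_orthogonal(B))
print(is_orthogonal(matmul(A, B)))     # part (a): products stay orthogonal
print(is_orthogonal(transpose(A)))     # part (d): the inverse A^tr is orthogonal
```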

Definition. The linear transformation $h : \mathbf{R}^n \to \mathbf{R}^n$ given by

$$h(\mathbf{x}) = A \cdot \mathbf{x}$$

is called an orthogonal transformation if $A$ is an orthogonal matrix. This
condition is equivalent to the requirement that $h$ carry the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$
for $\mathbf{R}^n$ to an orthonormal basis for $\mathbf{R}^n$.

Definition. Let $h : \mathbf{R}^n \to \mathbf{R}^n$. We say that $h$ is a (euclidean) isometry
if

$$\|h(\mathbf{x}) - h(\mathbf{y})\| = \|\mathbf{x} - \mathbf{y}\|$$

for all $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$. Thus an isometry is a map that preserves euclidean
distances.

Theorem 20.5. Let $h : \mathbf{R}^n \to \mathbf{R}^n$ be a map such that $h(\mathbf{0}) = \mathbf{0}$.

(a) The map $h$ is an isometry if and only if it preserves dot products.

(b) The map $h$ is an isometry if and only if it is an orthogonal
transformation.

Proof. (a) Given $\mathbf{x}$ and $\mathbf{y}$, we compute:

$$(1)\qquad \|h(\mathbf{x}) - h(\mathbf{y})\|^2 = \langle h(\mathbf{x}), h(\mathbf{x}) \rangle - 2\langle h(\mathbf{x}), h(\mathbf{y}) \rangle + \langle h(\mathbf{y}), h(\mathbf{y}) \rangle,$$

$$(2)\qquad \|\mathbf{x} - \mathbf{y}\|^2 = \langle \mathbf{x}, \mathbf{x} \rangle - 2\langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{y} \rangle.$$

If $h$ preserves dot products, then the right sides of (1) and (2) are equal;
thus $h$ preserves euclidean distances as well. Conversely, suppose $h$ preserves
euclidean distances. Then in particular, for all $\mathbf{x}$,

$$\|h(\mathbf{x}) - h(\mathbf{0})\| = \|\mathbf{x} - \mathbf{0}\|,$$

so that $\|h(\mathbf{x})\| = \|\mathbf{x}\|$. Then the first and last terms on the right side of (1)
are equal to the corresponding terms on the right side of (2). Furthermore,
the left sides of (1) and (2) are equal by hypothesis. It follows that

$$\langle h(\mathbf{x}), h(\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{y} \rangle,$$

as desired.

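(b) Let $h(\mathbf{x}) = A \cdot \mathbf{x}$, where $A$ is orthogonal; we show $h$ is an isometry.
By (a), it suffices to show $h$ preserves dot products. Now the dot product of
$h(\mathbf{x})$ and $h(\mathbf{y})$ can be expressed as the matrix product

$$h(\mathbf{x})^{tr} \cdot h(\mathbf{y})$$

if $h(\mathbf{x})$ and $h(\mathbf{y})$ are written as column matrices (as usual). We compute

$$h(\mathbf{x})^{tr} \cdot h(\mathbf{y}) = (A \cdot \mathbf{x})^{tr} \cdot (A \cdot \mathbf{y}) = \mathbf{x}^{tr} \cdot A^{tr} \cdot A \cdot \mathbf{y} = \mathbf{x}^{tr} \cdot \mathbf{y}.$$

Thus $h$ preserves dot products, so it is an isometry.

Conversely, let $h$ be an isometry with $h(\mathbf{0}) = \mathbf{0}$. Let $\mathbf{a}_i$ be the vector
$\mathbf{a}_i = h(\mathbf{e}_i)$ for all $i$; let $A$ be the matrix $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$. Since $h$ preserves
dot products by (a), the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are orthonormal; thus $A$ is an
orthogonal matrix. We show that $h(\mathbf{x}) = A \cdot \mathbf{x}$ for all $\mathbf{x}$; then the proof is
complete.

Since the vectors $\mathbf{a}_i$ form a basis for $\mathbf{R}^n$, for each $\mathbf{x}$ the vector $h(\mathbf{x})$ can
be written uniquely in the form

$$h(\mathbf{x}) = \sum_{i=1}^{n} \alpha_i(\mathbf{x})\, \mathbf{a}_i,$$

for certain real-valued functions $\alpha_i(\mathbf{x})$ of $\mathbf{x}$. Because the $\mathbf{a}_i$ are orthonormal,

$$\langle h(\mathbf{x}), \mathbf{a}_j \rangle = \alpha_j(\mathbf{x})$$

for each $j$. Because $h$ preserves dot products,

$$\langle h(\mathbf{x}), \mathbf{a}_j \rangle = \langle h(\mathbf{x}), h(\mathbf{e}_j) \rangle = \langle \mathbf{x}, \mathbf{e}_j \rangle = x_j$$

for all $j$. Thus $\alpha_j(\mathbf{x}) = x_j$, so that

$$h(\mathbf{x}) = \sum_{i=1}^{n} x_i \mathbf{a}_i = [\mathbf{a}_1 \cdots \mathbf{a}_n] \cdot \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = A \cdot \mathbf{x}. \qquad \square$$

Theorem 20.6. Let $h : \mathbf{R}^n \to \mathbf{R}^n$. Then $h$ is an isometry if and
only if it equals an orthogonal transformation followed by a translation,
that is, if and only if $h$ has the form

$$h(\mathbf{x}) = A \cdot \mathbf{x} + \mathbf{p},$$

where $A$ is an orthogonal matrix.

Proof. Given $h$, let $\mathbf{p} = h(\mathbf{0})$, and define $k(\mathbf{x}) = h(\mathbf{x}) - \mathbf{p}$. Then

$$\|k(\mathbf{x}) - k(\mathbf{y})\| = \|h(\mathbf{x}) - h(\mathbf{y})\|,$$

by direct computation. Thus $k$ is an isometry if and only if $h$ is an isometry.

Since $k(\mathbf{0}) = \mathbf{0}$, the map $k$ is an isometry if and only if $k(\mathbf{x}) = A \cdot \mathbf{x}$,
where $A$ is orthogonal. This in turn occurs if and only if $h(\mathbf{x}) = A \cdot \mathbf{x} + \mathbf{p}$. □

Theorem 20.7. Let $h : \mathbf{R}^n \to \mathbf{R}^n$ be an isometry. If $S$ is a rectifiable
set in $\mathbf{R}^n$, then the set $T = h(S)$ is rectifiable, and $v(T) = v(S)$.

Proof. The map $h$ is of the form $h(\mathbf{x}) = A \cdot \mathbf{x} + \mathbf{p}$, where $A$ is
orthogonal. Then $Dh(\mathbf{x}) = A$, and it follows from the change of variables
theorem that

$$v(T) = |\det A| \cdot v(S) = v(S). \qquad \square$$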
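
As an illustration of Theorem 20.6, the Python sketch below builds a map of the form $h(\mathbf{x}) = A \cdot \mathbf{x} + \mathbf{p}$ in the plane, with $A$ a rotation, and measures how far it is from preserving distances on a few sample points. The angle, the translation vector, the sample points, and the function names are hypothetical choices for the illustration.

```python
import math

def make_isometry(theta, p):
    """Return h(x) = A . x + p, where A is rotation by theta (an orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)

    def h(x):
        return (c * x[0] - s * x[1] + p[0], s * x[0] + c * x[1] + p[1])

    return h

def distance(x, y):
    """Euclidean distance between two points of the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

# Hypothetical rotation angle, translation vector, and sample points.
h = make_isometry(1.1, (3.0, -2.0))
points = [(0.0, 0.0), (1.0, 0.0), (2.5, -1.0), (-4.0, 7.0)]

# Largest discrepancy between ||h(x) - h(y)|| and ||x - y|| over all pairs.
worst = max(abs(distance(h(x), h(y)) - distance(x, y))
            for x in points for y in points)
print(worst)
```

The discrepancy `worst` is expected to be at the level of floating-point rounding error, not exactly zero.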
EXERCISES


1. Show that if $h$ is an orthogonal transformation, then $h$ carries every
orthonormal set to an orthonormal set.

2. Find a linear transformation $h : \mathbf{R}^n \to \mathbf{R}^n$ that preserves volumes but is
not an isometry.

3. Let $V$ be an arbitrary $n$-dimensional vector space. Show that the two
orientations of $V$ are well-defined.

4. Consider the vectors $\mathbf{a}_i$ in $\mathbf{R}^3$ such that

$$[\mathbf{a}_1 \ \mathbf{a}_2 \ \mathbf{a}_3 \ \mathbf{a}_4] = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 2 & 0 \end{bmatrix}.$$

Let $V$ be the subspace of $\mathbf{R}^3$ spanned by $\mathbf{a}_1$ and $\mathbf{a}_2$. Show that $\mathbf{a}_3$ and $\mathbf{a}_4$
also span $V$, and that the frames $(\mathbf{a}_1, \mathbf{a}_2)$ and $(\mathbf{a}_3, \mathbf{a}_4)$ belong to opposite
orientations of $V$.

5. Given $\theta$ and $\phi$, let

$$\mathbf{a}_1 = (\cos\theta, \sin\theta) \quad \text{and} \quad \mathbf{a}_2 = (\cos(\theta + \phi), \sin(\theta + \phi)).$$

Show that $(\mathbf{a}_1, \mathbf{a}_2)$ is right-handed if $0 < \phi < \pi$, and left-handed if
$-\pi < \phi < 0$. What happens if $\phi$ equals $0$ or $\pi$?




Manifolds 


We have studied the notion of volume for bounded subsets of euclidean space;
if $A$ is a bounded rectifiable set in $\mathbf{R}^k$, its volume is defined by the equation

$$v(A) = \int_A 1.$$

When $k = 1$, it is common to call $v(A)$ the length of $A$; when $k = 2$, it is
common to call $v(A)$ the area of $A$.

Now in calculus one studies the notion of length not only for subsets of
$\mathbf{R}^1$, but also for smooth curves in $\mathbf{R}^2$ and $\mathbf{R}^3$. And one studies the notion of
area not only for subsets of $\mathbf{R}^2$, but also for smooth surfaces in $\mathbf{R}^3$. In this
chapter, we introduce the $k$-dimensional analogues of curves and surfaces;
they are called $k$-manifolds in $\mathbf{R}^n$. And we define a notion of $k$-dimensional
volume for such objects. We also define what we mean by the integral of
a scalar function over a $k$-manifold with respect to $k$-volume, generalizing
notions defined in calculus for curves and surfaces.


§21. THE VOLUME OF A PARALLELOPIPED


We begin by studying parallelopipeds. Let $\mathcal{P}$ be a $k$-dimensional
parallelopiped in $\mathbf{R}^n$, with $k < n$. We wish to define a notion of $k$-dimensional
volume for $\mathcal{P}$. (Its $n$-dimensional volume is of course zero, since it lies in a
$k$-dimensional subspace of $\mathbf{R}^n$, which has measure zero in $\mathbf{R}^n$.) How shall we
proceed? There are two conditions that it is reasonable that such a volume
function should satisfy. First, we know that an orthogonal transformation of $\mathbf{R}^n$
preserves $n$-dimensional volume; it is reasonable to require that it preserve
$k$-dimensional volume as well. Second, if the parallelopiped happens to lie in the
subspace $\mathbf{R}^k \times 0$ of $\mathbf{R}^n$, then it is reasonable to require that its $k$-dimensional
volume agree with the usual notion of volume for a $k$-dimensional
parallelopiped in $\mathbf{R}^k$. These two “reasonable” conditions determine $k$-dimensional
volume completely, as we shall see.

We begin with a result from linear algebra which may already be familiar 
to you. 

Lemma 21.1. Let $W$ be a linear subspace of $\mathbf{R}^n$ of dimension $k$.
Then there is an orthonormal basis for $\mathbf{R}^n$ whose first $k$ elements form
a basis for $W$.

Proof. By Theorem 1.2, there is a basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$ for $\mathbf{R}^n$ whose first $k$
elements form a basis for $W$. There is a standard procedure for forming from
these vectors an orthogonal set of vectors $\mathbf{b}_1, \ldots, \mathbf{b}_n$ such that for each $i$, the
vectors $\mathbf{b}_1, \ldots, \mathbf{b}_i$ span the same space as the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_i$. It is called
the Gram-Schmidt process; we recall it here.

Given $\mathbf{a}_1, \ldots, \mathbf{a}_n$, we set

$$\mathbf{b}_1 = \mathbf{a}_1,$$
$$\mathbf{b}_2 = \mathbf{a}_2 - \lambda_{21} \mathbf{b}_1,$$

and for general $i$,

$$\mathbf{b}_i = \mathbf{a}_i - \lambda_{i1} \mathbf{b}_1 - \lambda_{i2} \mathbf{b}_2 - \cdots - \lambda_{i,i-1} \mathbf{b}_{i-1},$$

where the $\lambda_{ij}$ are scalars yet to be specified. No matter what these scalars are,
however, we note that for each $j$ the vector $\mathbf{a}_j$ equals a linear combination of
the vectors $\mathbf{b}_1, \ldots, \mathbf{b}_j$. Furthermore, for each $j$ the vector $\mathbf{b}_j$ can be written
as a linear combination of the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_j$. (The proof proceeds by
induction.) These two facts imply that, for each $i$, $\mathbf{a}_1, \ldots, \mathbf{a}_i$ and $\mathbf{b}_1, \ldots, \mathbf{b}_i$
span the same subspace of $\mathbf{R}^n$. It also follows that the vectors $\mathbf{b}_1, \ldots, \mathbf{b}_n$ are
independent, for there are $n$ of them, and they span $\mathbf{R}^n$ as we have just noted.
In particular, none of the $\mathbf{b}_i$ can equal $\mathbf{0}$.

Now we note that the scalars $\lambda_{ij}$ may in fact be chosen so that the
vectors $\mathbf{b}_i$ are mutually orthogonal. One proceeds by induction. If the vectors
$\mathbf{b}_1, \ldots, \mathbf{b}_{i-1}$ are mutually orthogonal, one simply takes the dot product of
both sides of the equation for $\mathbf{b}_i$ with each of the vectors $\mathbf{b}_j$ for $j = 1, \ldots, i-1$
to obtain the equation

$$\langle \mathbf{b}_i, \mathbf{b}_j \rangle = \langle \mathbf{a}_i, \mathbf{b}_j \rangle - \lambda_{ij} \langle \mathbf{b}_j, \mathbf{b}_j \rangle.$$

Since $\mathbf{b}_j \neq \mathbf{0}$, there is a (unique) value of $\lambda_{ij}$ that makes the right side of
this equation vanish. With this choice of the scalars $\lambda_{ij}$, the vector $\mathbf{b}_i$ is
orthogonal to each of the vectors $\mathbf{b}_1, \ldots, \mathbf{b}_{i-1}$.

Once we have the non-zero orthogonal vectors $\mathbf{b}_i$, we merely divide each
by its norm $\|\mathbf{b}_i\|$ to find the desired orthonormal basis for $\mathbf{R}^n$. □
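
The Gram-Schmidt process described in this proof translates directly into a short program. The Python sketch below follows the proof's recipe, taking $\lambda_{ij} = \langle \mathbf{a}_i, \mathbf{b}_j \rangle / \langle \mathbf{b}_j, \mathbf{b}_j \rangle$ and normalizing at the end; the function names and the sample basis are our own, and the code assumes the input vectors are independent (it makes no attempt at numerical robustness).

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list b_1, ..., b_n such that b_1, ..., b_i span the
    same space as the first i input vectors. Assumes the input is independent."""
    orthogonal = []
    for a in vectors:
        b = list(a)
        for prev in orthogonal:
            # The scalar lambda_ij of the proof, chosen so that <b_i, b_j> = 0.
            lam = dot(a, prev) / dot(prev, prev)
            b = [bi - lam * pi for bi, pi in zip(b, prev)]
        orthogonal.append(b)
    # Divide each non-zero orthogonal vector by its norm.
    return [[x / math.sqrt(dot(b, b)) for x in b] for b in orthogonal]

# Hypothetical basis for R^3.
basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
q = gram_schmidt(basis)
for row in q:
    print(row)
```

Applied to a basis whose first $k$ vectors span $W$, the first $k$ output vectors form an orthonormal basis for $W$, as in the lemma.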

Theorem 21.2. Let $W$ be a $k$-dimensional linear subspace of $\mathbf{R}^n$.
There is an orthogonal transformation $h : \mathbf{R}^n \to \mathbf{R}^n$ that carries $W$ onto
the subspace $\mathbf{R}^k \times 0$ of $\mathbf{R}^n$.

Proof. Choose an orthonormal basis $\mathbf{b}_1, \ldots, \mathbf{b}_n$ for $\mathbf{R}^n$ such that the
first $k$ basis elements $\mathbf{b}_1, \ldots, \mathbf{b}_k$ form a basis for $W$. Let $g : \mathbf{R}^n \to \mathbf{R}^n$ be
the linear transformation $g(\mathbf{x}) = B \cdot \mathbf{x}$, where $B$ is the matrix with successive
columns $\mathbf{b}_1, \ldots, \mathbf{b}_n$. Then $g$ is an orthogonal transformation, and $g(\mathbf{e}_i) = \mathbf{b}_i$
for all $i$. In particular, $g$ carries $\mathbf{R}^k \times 0$, which has basis $\mathbf{e}_1, \ldots, \mathbf{e}_k$, onto $W$.
The inverse of $g$ is the transformation we seek. □

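Now we obtain our notion of $k$-dimensional volume.

Theorem 21.3. There is a unique function $V$ that assigns, to
each $k$-tuple $(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ of elements of $\mathbf{R}^n$, a non-negative number
such that:

(1) If $h : \mathbf{R}^n \to \mathbf{R}^n$ is an orthogonal transformation, then

$$V(h(\mathbf{x}_1), \ldots, h(\mathbf{x}_k)) = V(\mathbf{x}_1, \ldots, \mathbf{x}_k).$$

(2) If $\mathbf{y}_1, \ldots, \mathbf{y}_k$ belong to the subspace $\mathbf{R}^k \times 0$ of $\mathbf{R}^n$, so that

$$\mathbf{y}_i = \begin{bmatrix} \mathbf{z}_i \\ \mathbf{0} \end{bmatrix}$$

for $\mathbf{z}_i \in \mathbf{R}^k$, then

$$V(\mathbf{y}_1, \ldots, \mathbf{y}_k) = |\det [\mathbf{z}_1 \cdots \mathbf{z}_k]|.$$

The function $V$ vanishes if and only if the vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are
dependent. It satisfies the equation

$$V(\mathbf{x}_1, \ldots, \mathbf{x}_k) = [\det(X^{tr} \cdot X)]^{1/2},$$

where $X$ is the $n$ by $k$ matrix $X = [\mathbf{x}_1 \cdots \mathbf{x}_k]$.

We often denote $V(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ simply by $V(X)$.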

Proof. Given $X = [\mathbf{x}_1 \cdots \mathbf{x}_k]$, define

$$F(X) = \det(X^{tr} \cdot X).$$

Step 1. If $h : \mathbf{R}^n \to \mathbf{R}^n$ is an orthogonal transformation, given by the
equation $h(\mathbf{x}) = A \cdot \mathbf{x}$, where $A$ is an orthogonal matrix, then

$$F(A \cdot X) = \det((A \cdot X)^{tr} \cdot (A \cdot X)) = \det(X^{tr} \cdot X) = F(X).$$

Furthermore, if $Z$ is a $k$ by $k$ matrix, and if $Y$ is the $n$ by $k$ matrix

$$Y = \begin{bmatrix} Z \\ 0 \end{bmatrix},$$

then

$$F(Y) = \det\left( [Z^{tr} \ \ 0] \cdot \begin{bmatrix} Z \\ 0 \end{bmatrix} \right) = \det(Z^{tr} \cdot Z) = \det{}^2 Z.$$

Step 2. It follows that $F$ is non-negative. For given $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathbf{R}^n$,
let $W$ be a $k$-dimensional subspace of $\mathbf{R}^n$ containing them. (If the $\mathbf{x}_i$ are
independent, $W$ is unique.) Let $h(\mathbf{x}) = A \cdot \mathbf{x}$ be an orthogonal transformation
of $\mathbf{R}^n$ carrying $W$ onto the subspace $\mathbf{R}^k \times 0$. Then $A \cdot X$ has the form

$$A \cdot X = \begin{bmatrix} Z \\ 0 \end{bmatrix},$$

so that $F(X) = F(A \cdot X) = \det{}^2 Z \geq 0$. Note that $F(X) = 0$ if and only
if the columns of $Z$ are dependent, and this occurs if and only if the vectors
$\mathbf{x}_1, \ldots, \mathbf{x}_k$ are dependent.

Step 3. Now we define $V(X) = (F(X))^{1/2}$. It follows from the
computations of Step 1 that $V$ satisfies conditions (1) and (2). And it follows
from the computation of Step 2 that $V$ is uniquely characterized by these two
conditions. □
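
The formula $V(X) = [\det(X^{tr} \cdot X)]^{1/2}$ is straightforward to compute. The Python sketch below is one hedged way to do so for small $k$; the function names and sample vectors are hypothetical, the determinant is found by cofactor expansion, and a negative value caused by rounding is clamped to zero before the square root is taken.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def det(m):
    """Determinant by cofactor expansion along the first row (small k only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def k_volume(vectors):
    """V(X) = sqrt(det(X^tr . X)), where the columns of X are the given vectors."""
    gram = [[dot(u, v) for v in vectors] for u in vectors]  # the matrix X^tr . X
    return math.sqrt(max(det(gram), 0.0))

# Vectors lying in R^2 x 0: condition (2) predicts |det [[3, 1], [0, 2]]| = 6.
print(k_volume([[3, 0, 0], [1, 2, 0]]))
# The same vectors after a permutation of coordinates (an orthogonal transformation).
print(k_volume([[0, 3, 0], [0, 1, 2]]))
# Dependent vectors give volume zero.
print(k_volume([[1, 2, 3], [2, 4, 6]]))
```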


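Definition. If $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are independent vectors in $\mathbf{R}^n$, we define the
$k$-dimensional volume of the parallelopiped $\mathcal{P} = \mathcal{P}(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ to be the
number $V(\mathbf{x}_1, \ldots, \mathbf{x}_k)$, which is positive.

EXAMPLE 1. Consider two independent vectors $\mathbf{a}$ and $\mathbf{b}$ in $\mathbf{R}^3$; let $X$ be
the matrix $X = [\mathbf{a} \ \mathbf{b}]$. Then $V(X)$ is the area of the parallelogram with
edges $\mathbf{a}$ and $\mathbf{b}$. Let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}$, defined by the equation
$\langle \mathbf{a}, \mathbf{b} \rangle = \|\mathbf{a}\|\,\|\mathbf{b}\| \cos\theta$. Then

$$(V(X))^2 = \det(X^{tr} \cdot X) = \det \begin{bmatrix} \|\mathbf{a}\|^2 & \langle \mathbf{a}, \mathbf{b} \rangle \\ \langle \mathbf{b}, \mathbf{a} \rangle & \|\mathbf{b}\|^2 \end{bmatrix} = \|\mathbf{a}\|^2 \|\mathbf{b}\|^2 (1 - \cos^2\theta) = \|\mathbf{a}\|^2 \|\mathbf{b}\|^2 \sin^2\theta.$$

Figure 21.1 shows why this number is interpreted in calculus as the square of
the area of the parallelogram with edges $\mathbf{a}$ and $\mathbf{b}$.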



Figure 21.1 


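In calculus one studies another formula for the area of the parallelogram
with edges $\mathbf{a}$ and $\mathbf{b}$. If $\mathbf{a} \times \mathbf{b}$ is the cross product of $\mathbf{a}$ and $\mathbf{b}$, defined by the
equation

$$\mathbf{a} \times \mathbf{b} = \det \begin{bmatrix} a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \mathbf{e}_1 - \det \begin{bmatrix} a_1 & b_1 \\ a_3 & b_3 \end{bmatrix} \mathbf{e}_2 + \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} \mathbf{e}_3,$$

then one learns in calculus that the number $\|\mathbf{a} \times \mathbf{b}\|$ equals the area of $\mathcal{P}(\mathbf{a}, \mathbf{b})$.
This is justified by verifying directly that

$$\|\mathbf{a}\|^2 \|\mathbf{b}\|^2 - \langle \mathbf{a}, \mathbf{b} \rangle^2 = \|\mathbf{a} \times \mathbf{b}\|^2.$$

Often this verification is left as an “exercise for the reader.” Some exercise!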
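
Short of carrying out the algebra, one can at least test the identity on examples. The Python sketch below evaluates both sides for integer vectors, so the comparison is exact; the sample vectors and function names are our own illustrative choices, and the cross product is written with the three 2 by 2 determinants of the defining equation.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    """Cross product in R^3, written with the three 2 by 2 determinants."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def gram_area_squared(a, b):
    """||a||^2 ||b||^2 - <a, b>^2, the determinant of X^tr . X for X = [a b]."""
    return dot(a, a) * dot(b, b) - dot(a, b) ** 2

def cross_area_squared(a, b):
    """||a x b||^2."""
    c = cross(a, b)
    return dot(c, c)

# Hypothetical integer vectors, so that both sides are computed exactly.
pairs = [((1, 2, 3), (4, 5, 6)), ((2, -1, 0), (0, 3, 5))]
for a, b in pairs:
    print(gram_area_squared(a, b), cross_area_squared(a, b))
```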


Just as there are for a parallelogram in $\mathbf{R}^3$, there are for a $k$-parallelopiped
in $\mathbf{R}^n$ two different formulas for its $k$-dimensional volume. The first is the
formula given in the preceding theorem. It is very convenient for theoretical
purposes, but sometimes not very pleasant for computational purposes. The
second, which is a generalization of the cross-product formula just discussed,
is often more convenient to use in practice. We derive it now; it will be used
in some of the examples and exercises.

Definition. Let $\mathbf{x}_1, \ldots, \mathbf{x}_k$ be vectors in $\mathbf{R}^n$ with $k \leq n$. Let $X$ be the
matrix $X = [\mathbf{x}_1 \cdots \mathbf{x}_k]$. If $I = (i_1, \ldots, i_k)$ is a $k$-tuple of integers such that
$1 \leq i_1 < i_2 < \cdots < i_k \leq n$, we call $I$ an ascending $k$-tuple from the set
$\{1, \ldots, n\}$, and we denote by

$$X_I \quad \text{or by} \quad X(i_1, \ldots, i_k)$$

the $k$ by $k$ submatrix of $X$ consisting of rows $i_1, \ldots, i_k$ of $X$.

More generally, if $I$ is any $k$-tuple of integers from the set $\{1, \ldots, n\}$,
not necessarily distinct nor arranged in any particular order, we use this same
notation to denote the $k$ by $k$ matrix whose successive rows are rows $i_1, \ldots, i_k$
of $X$. It need not be a submatrix of $X$ in this case, of course.

*Theorem 21.4. Let $X$ be an $n$ by $k$ matrix with $k \leq n$. Then

$$V(X) = \Big[ \sum_{[I]} \det{}^2 X_I \Big]^{1/2},$$

where the symbol $[I]$ indicates that the summation extends over all
ascending $k$-tuples from the set $\{1, \ldots, n\}$.

This theorem may be thought of as a Pythagorean theorem for $k$-volume.
It states that the square of the volume of a $k$-parallelopiped $\mathcal{P}$ in $\mathbf{R}^n$ is equal
to the sum of the squares of the volumes of the $k$-parallelopipeds obtained by
projecting $\mathcal{P}$ onto the various coordinate $k$-planes of $\mathbf{R}^n$.
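
Before turning to the proof, one can test the statement on a small matrix. The Python sketch below compares $\det(X^{tr} \cdot X)$ with the sum of the squares of the $k$ by $k$ minors taken over all ascending $k$-tuples of rows. The sample matrix and the function names `gram_determinant` and `sum_of_squared_minors` are hypothetical; the standard-library function `itertools.combinations` produces exactly the ascending $k$-tuples.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small sizes only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def gram_determinant(rows):
    """det(X^tr . X) for the n by k matrix X given as a list of its n rows."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    return det(gram)

def sum_of_squared_minors(rows):
    """Sum of det^2 X_I over all ascending k-tuples I of row indices."""
    k = len(rows[0])
    return sum(det([rows[i] for i in index_tuple]) ** 2
               for index_tuple in combinations(range(len(rows)), k))

# Hypothetical 4 by 2 integer matrix, so that both sides are computed exactly.
X = [[1, 0], [0, 1], [2, 1], [1, 3]]
print(gram_determinant(X), sum_of_squared_minors(X))
```

For this matrix the six minors are $1, 1, 3, -2, -1, 5$, and the squared volume is the sum of their squares.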

Proof. Let $X$ have size $n$ by $k$. Let

$$F(X) = \det(X^{tr} \cdot X) \quad \text{and} \quad G(X) = \sum_{[I]} \det{}^2 X_I.$$

Proving the theorem is equivalent to showing that $F(X) = G(X)$ for all $X$.

Step 1. The theorem holds when $k = 1$ or $k = n$. If $k = 1$, then $X$ is a
column matrix with entries $\lambda_1, \ldots, \lambda_n$, say. Then

$$F(X) = \sum (\lambda_i)^2 = G(X).$$

If $k = n$, the summation in the definition of $G$ has only one term, and

$$F(X) = \det{}^2 X = G(X).$$

Step 2. If $X = [\mathbf{x}_1 \cdots \mathbf{x}_k]$ and the $\mathbf{x}_i$ are orthogonal, then

$$F(X) = \|\mathbf{x}_1\|^2 \|\mathbf{x}_2\|^2 \cdots \|\mathbf{x}_k\|^2.$$

The general entry of $X^{tr} \cdot X$ is $\mathbf{x}_i^{tr} \cdot \mathbf{x}_j$, which is the dot product of $\mathbf{x}_i$ and
$\mathbf{x}_j$. Thus if the $\mathbf{x}_i$ are orthogonal, $X^{tr} \cdot X$ is a diagonal matrix with diagonal
entries $\|\mathbf{x}_i\|^2$.

Step 3. Consider the following two elementary column operations, where
$j \ne i$:

(1) Exchange columns $j$ and $i$.

(2) Replace column $j$ by itself plus $c$ times column $i$.

We show that applying either of these operations to $X$ does not change the
values of $F$ or $G$.

Given an elementary row operation, with corresponding elementary matrix
$E$, then $E \cdot X$ equals the matrix obtained by applying this elementary
row operation to $X$. One can compute the effect of applying the corresponding
elementary column operation to $X$ by transposing $X$, premultiplying
by $E$, and then transposing back. Thus the matrix obtained by applying an
elementary column operation to $X$ is the matrix

$$(E \cdot X^{\mathrm{tr}})^{\mathrm{tr}} = X \cdot E^{\mathrm{tr}}.$$

It follows that these two operations do not change the value of $F$. For

$$\begin{aligned}
F(X \cdot E^{\mathrm{tr}}) &= \det(E \cdot X^{\mathrm{tr}} \cdot X \cdot E^{\mathrm{tr}}) \\
&= (\det E)(\det(X^{\mathrm{tr}} \cdot X))(\det E^{\mathrm{tr}}) \\
&= F(X),
\end{aligned}$$

since $\det E = \pm 1$ for these two elementary operations.

Nor do these operations change the value of $G$. Note that if one applies
one of these elementary column operations to $X$ and then deletes all rows
but $i_1, \ldots, i_k$, the result is the same as if one had first deleted all rows but
$i_1, \ldots, i_k$ and then applied the elementary column operation. This means
that

$$(X \cdot E^{\mathrm{tr}})_I = X_I \cdot E^{\mathrm{tr}}.$$

We then compute

$$\begin{aligned}
G(X \cdot E^{\mathrm{tr}}) &= \sum_{[I]} \det{}^2 (X \cdot E^{\mathrm{tr}})_I \\
&= \sum_{[I]} \det{}^2 (X_I \cdot E^{\mathrm{tr}}) \\
&= \sum_{[I]} (\det{}^2 X_I)(\det{}^2 E^{\mathrm{tr}}) \\
&= G(X).
\end{aligned}$$

Step 4. In order to prove the theorem for all matrices of a given size,
we show that it suffices to prove it in the special case where all the entries of
the bottom row are zero except possibly for the last entry, and the columns
form an orthogonal set.

Given $X$, if the last row of $X$ has a non-zero entry, we may by elementary
operations of the specified types bring the matrix to the form

$$D = \begin{bmatrix} * \\ 0 \;\cdots\; 0 \;\; \lambda \end{bmatrix},$$

where $\lambda \ne 0$. If the last row of $X$ has no non-zero entry, it is already of
this form, with $\lambda = 0$. One now applies the Gram-Schmidt process to the
columns of this matrix. The first column is left as is. At the general step, the
$j$th column is replaced by itself minus scalar multiples of the earlier columns.
The Gram-Schmidt process thus involves only elementary column operations
of type (2). And the zeros in the last row remain unchanged during the
process. At the end of the process, the columns are orthogonal, and the
matrix still has the form of $D$.

Step 5. We prove the theorem, by induction on n. 

If n = 1, then k = 1 and Step 1 applies. If n = 2, then k = 1 or k = 2, 
and Step 1 applies. Now suppose the theorem holds for matrices having fewer 
than n rows. We prove it for matrices of size n by k. In view of Step 1, we 
need only consider the case 1 < k < n. In view of Step 4, we may assume 
that all entries in the bottom row of $X$, except possibly for the last, are zero,
and that the columns of $X$ are orthogonal. Then $X$ has the form

$$X = \begin{bmatrix} \mathbf{b}_1 & \cdots & \mathbf{b}_{k-1} & \mathbf{b}_k \\ 0 & \cdots & 0 & \lambda \end{bmatrix};$$

the vectors $\mathbf{b}_i$ of $\mathbf{R}^{n-1}$ are orthogonal because the columns of $X$ are orthogonal
vectors in $\mathbf{R}^n$. For convenience in notation, let $B$ and $C$ denote the
matrices

$$B = [\mathbf{b}_1 \cdots \mathbf{b}_k] \quad\text{and}\quad C = [\mathbf{b}_1 \cdots \mathbf{b}_{k-1}].$$

We compute $F(X)$ in terms of $B$ and $C$ as follows:

$$\begin{aligned}
F(X) &= \|\mathbf{b}_1\|^2 \cdots \|\mathbf{b}_{k-1}\|^2 (\|\mathbf{b}_k\|^2 + \lambda^2) \quad\text{by Step 2,} \\
&= F(B) + \lambda^2 F(C).
\end{aligned}$$

To compute $G(X)$, we break the summation in the definition of $G(X)$
into two parts, according to the value of $i_k$. We have

$$(*) \qquad G(X) = \sum_{i_k < n} \det{}^2 X_I + \sum_{i_k = n} \det{}^2 X_I.$$

Now if $I = (i_1, \ldots, i_k)$ is an ascending $k$-tuple with $i_k < n$, then $X_I = B_I$.
Hence the first summation in $(*)$ equals $G(B)$. On the other hand, if $i_k = n$,
one computes

$$\det X(i_1, \ldots, i_{k-1}, n) = \pm\lambda \det C(i_1, \ldots, i_{k-1}).$$

It follows that the second summation in $(*)$ equals $\lambda^2 G(C)$. Then

$$G(X) = G(B) + \lambda^2 G(C).$$

The induction hypothesis tells us that $F(B) = G(B)$ and $F(C) = G(C)$. It
follows that $F(X) = G(X)$. □


EXERCISES 


1. Let

$$X = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \end{bmatrix}.$$

(a) Find $X^{\mathrm{tr}} \cdot X$.

(b) Find $V(X)$.

2. Let $\mathbf{x}_1, \ldots, \mathbf{x}_k$ be vectors in $\mathbf{R}^n$. Show that

$$V(\mathbf{x}_1, \ldots, \lambda\mathbf{x}_i, \ldots, \mathbf{x}_k) = |\lambda| \, V(\mathbf{x}_1, \ldots, \mathbf{x}_k).$$

3. Let $h : \mathbf{R}^n \to \mathbf{R}^n$ be the function $h(\mathbf{x}) = \lambda\mathbf{x}$. If $\mathcal{P}$ is a $k$-dimensional
parallelopiped in $\mathbf{R}^n$, find the volume of $h(\mathcal{P})$ in terms of the volume of $\mathcal{P}$.

4. (a) Use Theorem 21.4 to verify the last equation stated in Example 1.

(b) Verify this equation by direct computation(!).

5. Prove the following:

Theorem. Let $W$ be an $n$-dimensional vector space with an inner
product. Then there exists a unique real-valued function $V(\mathbf{x}_1, \ldots, \mathbf{x}_k)$
of $k$-tuples of vectors of $W$ such that:

(i) Exchanging $\mathbf{x}_i$ with $\mathbf{x}_j$ does not change the value of $V$.

(ii) Replacing $\mathbf{x}_i$ by $\mathbf{x}_i + c\mathbf{x}_j$ (for $j \ne i$) does not change the value
of $V$.

(iii) Replacing $\mathbf{x}_i$ by $\lambda\mathbf{x}_i$ multiplies the value of $V$ by $|\lambda|$.

(iv) If the $\mathbf{x}_i$ are orthonormal, then $V(\mathbf{x}_1, \ldots, \mathbf{x}_k) = 1$.

Proof. (a) Prove uniqueness. [Hint: Use the Gram-Schmidt process.]

(b) Prove existence. [Hint: If $f : W \to \mathbf{R}^n$ is a linear transformation
that carries an orthonormal basis to an orthonormal basis, then $f$
carries the inner product on $W$ to the dot product on $\mathbf{R}^n$.]


§22. THE VOLUME OF A PARAMETRIZED-MANIFOLD


Now we define what we mean by a parametrized-manifold in $\mathbf{R}^n$, and we
define the volume of such an object. This definition generalizes the definitions
given in calculus for the length of a parametrized-curve, and the area of a
parametrized-surface, in $\mathbf{R}^3$.

Definition. Let $k \le n$. Let $A$ be open in $\mathbf{R}^k$, and let $\alpha : A \to \mathbf{R}^n$ be a
map of class $C^r$ ($r \ge 1$). The set $Y = \alpha(A)$, together with the map $\alpha$,
constitute what is called a parametrized-manifold, of dimension $k$. We denote
this parametrized-manifold by $Y_\alpha$; and we define the ($k$-dimensional) volume
of $Y_\alpha$ by the equation

$$v(Y_\alpha) = \int_A V(D\alpha),$$

provided the integral exists.

Let us give a plausibility argument to justify this definition of volume.
Suppose $A$ is the interior of a rectangle $Q$ in $\mathbf{R}^k$, and suppose $\alpha : A \to \mathbf{R}^n$
can be extended to be of class $C^r$ in a neighborhood of $Q$. Let $Y = \alpha(A)$.
Let $P$ be a partition of $Q$. Consider one of the subrectangles

$$R = [a_1, a_1 + h_1] \times \cdots \times [a_k, a_k + h_k]$$

determined by $P$. Now $R$ is mapped by $\alpha$ onto a “curved rectangle” contained
in $Y$. The edge of $R$ having endpoints $\mathbf{a}$ and $\mathbf{a} + h_i\mathbf{e}_i$ is mapped by $\alpha$ into a
curve in $\mathbf{R}^n$; the vector joining the initial point of this curve to the final point
is the vector

$$\alpha(\mathbf{a} + h_i\mathbf{e}_i) - \alpha(\mathbf{a}).$$

A first-order approximation to this vector is, as we know, the vector

$$\mathbf{v}_i = D\alpha(\mathbf{a}) \cdot h_i\mathbf{e}_i = (\partial\alpha/\partial x_i) \cdot h_i.$$

Figure 22.1

It is plausible therefore to consider the $k$-dimensional parallelopiped $\mathcal{P}$ whose
edges are the vectors $\mathbf{v}_i$ to be in some sense a first-order approximation to
the “curved rectangle” $\alpha(R)$. See Figure 22.1. The $k$-dimensional volume of
$\mathcal{P}$ is the number

$$\begin{aligned}
V(\mathbf{v}_1, \ldots, \mathbf{v}_k) &= V(\partial\alpha/\partial x_1, \ldots, \partial\alpha/\partial x_k) \cdot (h_1 \cdots h_k) \\
&= V(D\alpha(\mathbf{a})) \cdot v(R).
\end{aligned}$$

When we sum this expression over all subrectangles $R$, we obtain a number
which lies between the lower and upper sums for the function $V(D\alpha)$ relative
to the partition $P$. Hence this sum is an approximation to the integral

$$\int_A V(D\alpha);$$

the approximation may be made as close as we wish by choosing an appropriate
partition $P$.
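The sum described in this plausibility argument can be sketched numerically. The Python fragment below is only an illustration under assumptions of our own: it uses the spherical patch $\alpha(\varphi, \theta) = (\sin\varphi\cos\theta, \sin\varphi\sin\theta, \cos\varphi)$ on $(0, \pi/2) \times (0, 2\pi)$, which does not appear in the text, and uniform partitions; the names `volume_factor` and `riemann_volume` are hypothetical. Its image is the upper unit hemisphere, whose area $2\pi$ agrees with Example 4 below for $a = 1$.

```python
from math import cos, pi, sin, sqrt


def volume_factor(phi, theta):
    """V(D alpha) = sqrt(det(D alpha^tr . D alpha)) for the patch
    alpha(phi, theta) = (sin phi cos theta, sin phi sin theta, cos phi)."""
    d_phi = (cos(phi) * cos(theta), cos(phi) * sin(theta), -sin(phi))
    d_theta = (-sin(phi) * sin(theta), sin(phi) * cos(theta), 0.0)
    e = sum(u * u for u in d_phi)
    f = sum(u * v for u, v in zip(d_phi, d_theta))
    g = sum(v * v for v in d_theta)
    # Guard against a tiny negative value caused by rounding.
    return sqrt(max(e * g - f * f, 0.0))


def riemann_volume(n_phi, n_theta):
    """Sum of V(D alpha(a)) * v(R) over a uniform partition of
    (0, pi/2) x (0, 2 pi), with a the lower-left corner of each R."""
    h_phi = (pi / 2) / n_phi
    h_theta = (2 * pi) / n_theta
    total = 0.0
    for i in range(n_phi):
        for j in range(n_theta):
            total += volume_factor(i * h_phi, j * h_theta) * h_phi * h_theta
    return total


coarse = riemann_volume(20, 20)
fine = riemann_volume(400, 40)
```

Refining the partition moves the sum toward $2\pi$, in line with the claim that the approximation can be made as close as we wish.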

We now define the integral of a scalar function over a parametrized-manifold.

Definition. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$; let
$Y = \alpha(A)$. Let $f$ be a real-valued continuous function defined at each point
of $Y$. We define the integral of $f$ over $Y_\alpha$, with respect to volume, by
the equation

$$\int_{Y_\alpha} f \, dV = \int_A (f \circ \alpha) V(D\alpha),$$

provided this integral exists.

Here we are reverting to “calculus notation” in using the meaningless
symbol $dV$ to denote the “integral with respect to volume.” Note that in this
notation,

$$v(Y_\alpha) = \int_{Y_\alpha} dV.$$

We show that this integral is “invariant under reparametrization.”

Theorem 22.1. Let $g : A \to B$ be a diffeomorphism of open sets
in $\mathbf{R}^k$. Let $\beta : B \to \mathbf{R}^n$ be a map of class $C^r$; let $Y = \beta(B)$. Let $\alpha = \beta \circ g$;
then $\alpha : A \to \mathbf{R}^n$ and $Y = \alpha(A)$. If $f : Y \to \mathbf{R}$ is a continuous function,
then $f$ is integrable over $Y_\beta$ if and only if it is integrable over $Y_\alpha$; in this
case

$$\int_{Y_\alpha} f \, dV = \int_{Y_\beta} f \, dV.$$

In particular, $v(Y_\alpha) = v(Y_\beta)$.

Figure 22.2

Proof. We must show that

$$\int_B (f \circ \beta) V(D\beta) = \int_A (f \circ \alpha) V(D\alpha),$$

where one integral exists if the other does. See Figure 22.2.

The change of variables theorem tells us that

$$\int_B (f \circ \beta) V(D\beta) = \int_A ((f \circ \beta) \circ g)(V(D\beta) \circ g) \, |\det Dg|.$$

We show that

$$(V(D\beta) \circ g) \, |\det Dg| = V(D\alpha),$$

and the proof is complete. Let $\mathbf{x}$ denote the general point of $A$; let $\mathbf{y} = g(\mathbf{x})$.
By the chain rule,

$$D\alpha(\mathbf{x}) = D\beta(\mathbf{y}) \cdot Dg(\mathbf{x}).$$

Then

$$\begin{aligned}
[V(D\alpha(\mathbf{x}))]^2 &= \det(Dg(\mathbf{x})^{\mathrm{tr}} \cdot D\beta(\mathbf{y})^{\mathrm{tr}} \cdot D\beta(\mathbf{y}) \cdot Dg(\mathbf{x})) \\
&= \det(Dg(\mathbf{x}))^2 \, [V(D\beta(\mathbf{y}))]^2.
\end{aligned}$$

Our desired equation follows. □
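A small numerical experiment can make Theorem 22.1 concrete in the case $k = 1$. The sketch below rests on choices that are ours and not the text's: the curve $\beta(s) = (s, s^2)$ on $(0, 1)$, the diffeomorphism $g(t) = t^2$ of $(0, 1)$ onto itself, and a midpoint rule with a fixed number of subintervals. It approximates $v(Y_\beta)$ and $v(Y_\alpha)$ for $\alpha = \beta \circ g$ and compares both with the closed-form arc length of the parabola.

```python
from math import asinh, sqrt


def midpoint_integral(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over (lo, hi)."""
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h


def speed_beta(s):
    """V(D beta) for beta(s) = (s, s^2): the norm of (1, 2s)."""
    return sqrt(1.0 + 4.0 * s * s)


def speed_alpha(t):
    """V(D alpha) for alpha = beta o g with g(t) = t^2, so that
    alpha(t) = (t^2, t^4) and D alpha(t) = (2t, 4t^3)."""
    return sqrt((2.0 * t) ** 2 + (4.0 * t ** 3) ** 2)


length_beta = midpoint_integral(speed_beta, 0.0, 1.0)
length_alpha = midpoint_integral(speed_alpha, 0.0, 1.0)

# Closed form of the integral of sqrt(1 + 4 s^2) over (0, 1).
exact = sqrt(5.0) / 2.0 + asinh(2.0) / 4.0
```

The two integrands differ pointwise, yet the integrals agree, because `speed_alpha(t)` equals `speed_beta(g(t))` times $|g'(t)| = 2t$, exactly as in the proof.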

A remark on notation. In this book, we shall use the symbol $dV$ when
dealing with the integral with respect to volume, to avoid confusion with
the differential operator $d$ and the notation $\int_A d\omega$, which we shall introduce
in succeeding chapters. The integrals $\int_A dV$ and $\int_A d\omega$ are quite different
notions. It is however common in the literature to use the same symbol $d$
in both situations, and the reader must determine from the context which
meaning is intended.

EXAMPLE 1. Let $A$ be an open interval in $\mathbf{R}^1$, and let $\alpha : A \to \mathbf{R}^n$ be a map
of class $C^r$. Let $Y = \alpha(A)$. Then $Y_\alpha$ is called a parametrized-curve in $\mathbf{R}^n$,
and its 1-dimensional volume is often called its length. This length is given
by the formula

$$v(Y_\alpha) = \int_A \Bigl[\Bigl(\frac{d\alpha_1}{dt}\Bigr)^2 + \cdots + \Bigl(\frac{d\alpha_n}{dt}\Bigr)^2\Bigr]^{1/2},$$

since $D\alpha$ is the column matrix whose entries are the functions $d\alpha_i/dt$. This
formula may be familiar to you from calculus, in the case $n = 3$, as the formula
for computing the arc length of a parametrized-curve.

EXAMPLE 2. Consider the parametrized-curve

$$\alpha(t) = (a\cos t, a\sin t) \quad\text{for } 0 < t < 3\pi.$$

Using the formula of Example 1, we compute its length as

$$\int_0^{3\pi} [a^2\sin^2 t + a^2\cos^2 t]^{1/2} = 3\pi a.$$

See Figure 22.3. Since $\alpha$ is not one-to-one, what this number measures is not
the actual length of the image set (which is the circle of radius $a$) but rather
the distance travelled by a particle whose equation of motion is $\mathbf{x} = \alpha(t)$ for
$0 < t < 3\pi$. We shall later restrict ourselves to parametrizations that are
one-to-one, to avoid this situation.
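The computation in Example 2 is easy to reproduce numerically. The following sketch (with an illustrative radius `a = 2.0` and a hypothetical helper name `traversed_length`, neither taken from the text) evaluates the integral by the midpoint rule and compares the result with the circumference of the image circle.

```python
from math import cos, pi, sin, sqrt


def traversed_length(a, t_max, n=30000):
    """Midpoint-rule value of the integral over (0, t_max) of
    [a^2 sin^2 t + a^2 cos^2 t]^(1/2), the length formula of Example 1."""
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += sqrt((a * sin(t)) ** 2 + (a * cos(t)) ** 2)
    return total * h


a = 2.0  # illustrative radius
length = traversed_length(a, 3 * pi)
circumference = 2 * pi * a
```

Here `length` is one and a half times `circumference`, which reflects the fact that the parametrization traverses half of the circle twice.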




Figure 22.3 


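EXAMPLE 3. Let $A$ be open in $\mathbf{R}^2$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$; let
$Y = \alpha(A)$. Then $Y_\alpha$ is called a parametrized-surface in $\mathbf{R}^n$, and its
2-dimensional volume is often called its area.

Let us consider the case $n = 3$. If we use $(x, y)$ as the general point of
$\mathbf{R}^2$, then $D\alpha = [\partial\alpha/\partial x \;\; \partial\alpha/\partial y]$, and

$$v(Y_\alpha) = \int_A \Bigl\| \frac{\partial\alpha}{\partial x} \times \frac{\partial\alpha}{\partial y} \Bigr\|.$$

(See Example 1 of the preceding section.) In particular, if $\alpha$ has the form

$$\alpha(x, y) = (x, y, f(x, y)),$$

where $f : A \to \mathbf{R}$ is a $C^r$ function, then $Y$ is simply the graph of $f$, and we
have

$$D\alpha = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \partial f/\partial x & \partial f/\partial y \end{bmatrix},$$

so that

$$v(Y_\alpha) = \int_A [1 + (\partial f/\partial x)^2 + (\partial f/\partial y)^2]^{1/2}.$$

You may recognize these as formulas for surface area given in calculus.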


EXAMPLE 4. Suppose $A$ is the open disc $x^2 + y^2 < a^2$ in $\mathbf{R}^2$, and $f$ is the
function

$$f(x, y) = [a^2 - x^2 - y^2]^{1/2}.$$

The graph of $f$ is called a hemisphere of radius $a$. See Figure 22.4.

Let $\alpha(x, y) = (x, y, f(x, y))$. You can check that

$$V(D\alpha) = a/(a^2 - x^2 - y^2)^{1/2},$$

so that (using polar coordinates)

$$v(Y_\alpha) = \int_B ar/(a^2 - r^2)^{1/2},$$

where $B$ is the open set $(0, a) \times (0, 2\pi)$ in the $(r, \theta)$-plane. This is an improper
integral, so we cannot use the Fubini theorem, which was proved only for the
ordinary integral. Instead, we integrate over the set $(0, a_n) \times (0, 2\pi)$ using the
Fubini theorem, where $0 < a_n < a$, and then we let $a_n \to a$. We have

$$v(Y_\alpha) = \lim_{n \to \infty} (-2\pi a)[(a^2 - a_n^2)^{1/2} - a] = 2\pi a^2.$$

A different method for computing this area, one that avoids improper
integrals, is given in §25.
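The limiting process in Example 4 can be watched directly. The sketch below evaluates the closed-form value $(-2\pi a)[(a^2 - a_n^2)^{1/2} - a]$ of the truncated integral for a sequence of cutoffs $a_n$ increasing to $a$. The particular sequence $a_n = a(1 - 10^{-n})$ and the name `truncated_area` are assumptions made for illustration only.

```python
from math import pi, sqrt


def truncated_area(a, a_n):
    """Value of the integral of a*r/(a^2 - r^2)^(1/2) over
    (0, a_n) x (0, 2 pi), using the antiderivative -a (a^2 - r^2)^(1/2)."""
    return (-2.0 * pi * a) * (sqrt(a * a - a_n * a_n) - a)


a = 1.0  # illustrative radius
cutoffs = [a * (1.0 - 10.0 ** (-m)) for m in range(1, 8)]
areas = [truncated_area(a, a_n) for a_n in cutoffs]
limit = 2.0 * pi * a * a
```

The values in `areas` increase toward `limit`, but slowly: the error behaves like a multiple of $(a - a_n)^{1/2}$, which is one practical sign that the integrand is unbounded near $r = a$.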


EXERCISES 

1. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$; let $Y = \alpha(A)$.
Suppose $h : \mathbf{R}^n \to \mathbf{R}^n$ is an isometry; let $Z = h(Y)$ and let $\beta = h \circ \alpha$.
Show that $Y_\alpha$ and $Z_\beta$ have the same volume.

2. Let $A$ be open in $\mathbf{R}^k$; let $f : A \to \mathbf{R}$ be of class $C^r$; let $Y$ be the graph
of $f$ in $\mathbf{R}^{k+1}$, parametrized by the function $\alpha : A \to \mathbf{R}^{k+1}$ given by
$\alpha(\mathbf{x}) = (\mathbf{x}, f(\mathbf{x}))$. Express $v(Y_\alpha)$ as an integral.

3. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$; let $Y = \alpha(A)$.
The centroid $\mathbf{c}(Y_\alpha)$ of the parametrized-manifold $Y_\alpha$ is the point of $\mathbf{R}^n$
whose $i$th coordinate is given by the equation

$$c_i(Y_\alpha) = [1/v(Y_\alpha)] \int_{Y_\alpha} \pi_i \, dV,$$

where $\pi_i : \mathbf{R}^n \to \mathbf{R}$ is the $i$th projection function.

(a) Find the centroid of the parametrized-curve

$$\alpha(t) = (a\cos t, a\sin t) \quad\text{with } 0 < t < \pi.$$

(b) Find the centroid of the hemisphere of radius $a$ in $\mathbf{R}^3$. (See Example 4.)

*4. The following exercise gives a strong plausibility argument justifying our
definition of volume. We consider only the case of a surface in $\mathbf{R}^3$, but a
similar result holds in general.

Given three points $\mathbf{a}, \mathbf{b}, \mathbf{c}$ in $\mathbf{R}^3$, let $C$ be the matrix with columns
$\mathbf{b} - \mathbf{a}$ and $\mathbf{c} - \mathbf{a}$. The transformation $h : \mathbf{R}^2 \to \mathbf{R}^3$ given by $h(\mathbf{x}) = C \cdot \mathbf{x} + \mathbf{a}$
carries $\mathbf{0}, \mathbf{e}_1, \mathbf{e}_2$ to $\mathbf{a}, \mathbf{b}, \mathbf{c}$, respectively. The image $Y$ under $h$ of the set

$$A = \{(x, y) \mid x > 0 \text{ and } y > 0 \text{ and } x + y < 1\}$$

is called the (open) triangle in $\mathbf{R}^3$ with vertices $\mathbf{a}, \mathbf{b}, \mathbf{c}$. See Figure 22.5.
The area of the parametrized-surface $Y_h$ is one-half the area of the
parallelopiped with edges $\mathbf{b} - \mathbf{a}$ and $\mathbf{c} - \mathbf{a}$, as you can check.

Figure 22.5

Now let $Q$ be a rectangle in $\mathbf{R}^2$ and let $\alpha : Q \to \mathbf{R}^3$; suppose $\alpha$ extends
to a map of class $C^r$ defined in an open set containing $Q$. Let $P$ be a
partition of $Q$. Let $R$ be a subrectangle determined by $P$, say

$$R = [a, a + h] \times [b, b + k].$$

Consider the triangle $\Delta_1(R)$ having vertices

$$\alpha(a, b), \quad \alpha(a + h, b), \quad\text{and}\quad \alpha(a + h, b + k)$$

and the triangle $\Delta_2(R)$ having vertices

$$\alpha(a, b), \quad \alpha(a, b + k), \quad\text{and}\quad \alpha(a + h, b + k).$$

We consider these two triangles to be an approximation to the “curved
rectangle” $\alpha(R)$. See Figure 22.6. We then define

$$A(P) = \sum_R [v(\Delta_1(R)) + v(\Delta_2(R))],$$

where the sum extends over all subrectangles $R$ determined by $P$. This
number is the area of a polyhedral surface that approximates $\alpha(Q)$.
Prove the following:

Theorem. Let $Q$ be a rectangle in $\mathbf{R}^2$ and let $\alpha : A \to \mathbf{R}^3$ be a
map of class $C^r$ defined in an open set containing $Q$. Given $\epsilon > 0$,
there is a $\delta > 0$ such that for every partition $P$ of $Q$ of mesh less
than $\delta$,

$$\Bigl| A(P) - \int_Q V(D\alpha) \Bigr| < \epsilon.$$

Figure 22.6

Proof. (a) Given points $\mathbf{x}_1, \ldots, \mathbf{x}_6$ of $Q$, let

$$\mathcal{D}\alpha(\mathbf{x}_1, \ldots, \mathbf{x}_6) = \begin{bmatrix} D_1\alpha_1(\mathbf{x}_1) & D_2\alpha_1(\mathbf{x}_4) \\ D_1\alpha_2(\mathbf{x}_2) & D_2\alpha_2(\mathbf{x}_5) \\ D_1\alpha_3(\mathbf{x}_3) & D_2\alpha_3(\mathbf{x}_6) \end{bmatrix}.$$

Then $\mathcal{D}\alpha$ is just the matrix $D\alpha$ with its entries evaluated at different
points of $Q$. Show that if $R$ is a subrectangle determined by $P$, then
there are points $\mathbf{x}_1, \ldots, \mathbf{x}_6$ of $R$ such that

$$v(\Delta_1(R)) = \tfrac{1}{2} V(\mathcal{D}\alpha(\mathbf{x}_1, \ldots, \mathbf{x}_6)) \cdot v(R).$$

Prove a similar result for $v(\Delta_2(R))$.

(b) Given $\epsilon > 0$, show one can choose $\delta > 0$ so that if $\mathbf{x}_i, \mathbf{y}_i \in Q$ with
$|\mathbf{x}_i - \mathbf{y}_i| < \delta$ for $i = 1, \ldots, 6$, then

$$|V(\mathcal{D}\alpha(\mathbf{x}_1, \ldots, \mathbf{x}_6)) - V(\mathcal{D}\alpha(\mathbf{y}_1, \ldots, \mathbf{y}_6))| < \epsilon.$$

(c) Prove the theorem.


§23. MANIFOLDS IN $\mathbf{R}^n$


Manifolds form one of the most important classes of spaces in mathematics.
They are useful in such diverse fields as differential geometry, theoretical
physics, and algebraic topology. We shall restrict ourselves in this book to
manifolds that are submanifolds of euclidean space $\mathbf{R}^n$. In a final chapter, we
define abstract manifolds and discuss how our results generalize to that case.

We begin by defining a particular kind of manifold.

Definition. Let $k > 0$. Suppose that $M$ is a subspace of $\mathbf{R}^n$ having
the following property: For each $\mathbf{p} \in M$, there is a set $V$ containing $\mathbf{p}$ that
is open in $M$, a set $U$ that is open in $\mathbf{R}^k$, and a continuous map $\alpha : U \to V$
carrying $U$ onto $V$ in a one-to-one fashion, such that:

(1) $\alpha$ is of class $C^r$.

(2) $\alpha^{-1} : V \to U$ is continuous.

(3) $D\alpha(\mathbf{x})$ has rank $k$ for each $\mathbf{x} \in U$.

Then $M$ is called a $k$-manifold without boundary in $\mathbf{R}^n$, of class $C^r$. The
map $\alpha$ is called a coordinate patch on $M$ about $\mathbf{p}$.

Let us explore the geometric meaning of the various conditions in this
definition.

EXAMPLE 1. Consider the case $k = 1$. If $\alpha$ is a coordinate patch on $M$, the
condition that $D\alpha$ have rank 1 means merely that $D\alpha \ne 0$. This condition
rules out the possibility that $M$ could have “cusps” and “corners.” For example,
let $\alpha : \mathbf{R} \to \mathbf{R}^2$ be given by the equation $\alpha(t) = (t^3, t^2)$, and let $M$ be the
image set of $\alpha$. Then $M$ has a cusp at the origin. (See Figure 23.1.) Here $\alpha$
is of class $C^\infty$ and $\alpha^{-1}$ is continuous, but $D\alpha$ does not have rank 1 at $t = 0$.

Figure 23.1

Similarly, let $\beta : \mathbf{R} \to \mathbf{R}^2$ be given by $\beta(t) = (t^3, |t^3|)$, and let $N$ be the
image set of $\beta$. Then $N$ has a corner at the origin. (See Figure 23.2.) Here
$\beta$ is of class $C^2$ (as you can check) and $\beta^{-1}$ is continuous, but $D\beta$ does not
have rank 1 at $t = 0$.

Figure 23.2


EXAMPLE 2. Consider the case $k = 2$. The condition that $D\alpha(\mathbf{a})$ have rank 2
means that the columns $\partial\alpha/\partial x_1$ and $\partial\alpha/\partial x_2$ of $D\alpha$ are independent at $\mathbf{a}$.
Note that $\partial\alpha/\partial x_j$ is the velocity vector of the curve $f(t) = \alpha(\mathbf{a} + t\mathbf{e}_j)$ and
is thus tangent to the surface $M$. Then $\partial\alpha/\partial x_1$ and $\partial\alpha/\partial x_2$ span a
2-dimensional “tangent plane” to $M$. See Figure 23.3.

Figure 23.3

As an example of what can happen when this condition fails, consider the
function $\alpha : \mathbf{R}^2 \to \mathbf{R}^3$ given by the equation

$$\alpha(x, y) = (x(x^2 + y^2), \; y(x^2 + y^2), \; x^2 + y^2),$$

and let $M$ be the image set of $\alpha$. Then $M$ fails to have a tangent plane at
the origin. See Figure 23.4. The map $\alpha$ is of class $C^\infty$ and $\alpha^{-1}$ is continuous,
but $D\alpha$ does not have rank 2 at $\mathbf{0}$.

Figure 23.4

EXAMPLE 3. The condition that $\alpha^{-1}$ be continuous also rules out various
sorts of “pathological behavior.” For instance, let $\alpha$ be the map

$$\alpha(t) = (\sin 2t)(|\cos t|, \sin t) \quad\text{for } 0 < t < \pi,$$

and let $M$ be the image set of $\alpha$. Then $M$ is a “figure eight” in the plane.
The map $\alpha$ is of class $C^1$ with $D\alpha$ of rank 1, and $\alpha$ maps the interval $(0, \pi)$
in a one-to-one fashion onto $M$. But the function $\alpha^{-1}$ is not continuous. For
continuity of $\alpha^{-1}$ means that $\alpha$ carries any set $U_0$ that is open in $U$ onto
a set that is open in $M$. In this case, the image of the smaller interval $U_0$
pictured in Figure 23.5 is not open in $M$. Another way of seeing that $\alpha^{-1}$ is
not continuous is to note that points near $\mathbf{0}$ in $M$ need not map under $\alpha^{-1}$
to points near $\pi/2$.
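The last remark of Example 3 can be observed numerically. In the sketch below (the sample parameters $10^{-m}$ and all variable names are our own choices), the images of parameters near the endpoints $0$ and $\pi$ are seen to approach the origin of the plane, which is the image of $t = \pi/2$, while the parameters themselves stay far from $\pi/2$.

```python
from math import cos, hypot, pi, sin


def alpha(t):
    """The figure-eight map alpha(t) = (sin 2t)(|cos t|, sin t) on (0, pi)."""
    s = sin(2.0 * t)
    return (s * abs(cos(t)), s * sin(t))


# alpha(pi/2) is the origin, up to floating-point rounding of sin(pi).
origin = alpha(pi / 2)

# Parameters approaching the endpoints of (0, pi).
samples = [10.0 ** (-m) for m in range(1, 6)]

# Distances from the origin of the corresponding image points.
near_zero = [hypot(*alpha(t)) for t in samples]
near_pi = [hypot(*alpha(pi - t)) for t in samples]
```

Both lists of distances shrink toward zero, so every neighborhood of the origin in $M$ contains points whose preimages lie near $0$ or near $\pi$, not near $\pi/2$.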



Figure 23.5 


EXAMPLE 4. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$; let
$Y = \alpha(A)$. Then $Y_\alpha$ is a parametrized-manifold; but $Y$ need not be a
manifold. However, if $\alpha$ is one-to-one and $\alpha^{-1}$ is continuous and $D\alpha$ has rank $k$,
then $Y$ is a manifold without boundary, and in fact $Y$ is covered by the single
coordinate patch $\alpha$.

Now we define what we mean by a manifold in general. We must first
generalize our notion of differentiability to functions that are defined on arbitrary
subsets of $\mathbf{R}^k$.

Definition. Let $S$ be a subset of $\mathbf{R}^k$; let $f : S \to \mathbf{R}^n$. We say that $f$ is
of class $C^r$ on $S$ if $f$ may be extended to a function $g : U \to \mathbf{R}^n$ that is of
class $C^r$ on an open set $U$ of $\mathbf{R}^k$ containing $S$.

It follows from this definition that a composite of $C^r$ functions is of class
$C^r$. Suppose $S \subset \mathbf{R}^k$ and $f_1 : S \to \mathbf{R}^n$ is of class $C^r$. Next, suppose that
$T \subset \mathbf{R}^n$ and $f_1(S) \subset T$ and $f_2 : T \to \mathbf{R}^p$ is of class $C^r$. Then $f_2 \circ f_1 : S \to \mathbf{R}^p$
is of class $C^r$. For if $g_1$ is a $C^r$ extension of $f_1$ to an open set $U$ in $\mathbf{R}^k$, and
if $g_2$ is a $C^r$ extension of $f_2$ to an open set $V$ in $\mathbf{R}^n$, then $g_2 \circ g_1$ is a $C^r$
extension of $f_2 \circ f_1$ that is defined on the open set $g_1^{-1}(V)$ of $\mathbf{R}^k$ containing $S$.

The following lemma shows that $f$ is of class $C^r$ if it is locally of class $C^r$:

Lemma 23.1. Let $S$ be a subset of $\mathbf{R}^k$; let $f : S \to \mathbf{R}^n$. If for each
$\mathbf{x} \in S$, there is a neighborhood $U_\mathbf{x}$ of $\mathbf{x}$ and a function $g_\mathbf{x} : U_\mathbf{x} \to \mathbf{R}^n$ of
class $C^r$ that agrees with $f$ on $U_\mathbf{x} \cap S$, then $f$ is of class $C^r$ on $S$.

Proof. The lemma was given as an exercise in §16; we provide a proof
here. Cover $S$ by the neighborhoods $U_\mathbf{x}$; let $A$ be the union of these
neighborhoods; let $\{\phi_i\}$ be a partition of unity on $A$ of class $C^r$ dominated by the
collection $\{U_\mathbf{x}\}$. For each $i$, choose one of the neighborhoods $U_\mathbf{x}$ containing
the support of $\phi_i$, and let $g_i$ denote the $C^r$ function $g_\mathbf{x} : U_\mathbf{x} \to \mathbf{R}^n$. The $C^r$
function $\phi_i g_i : U_\mathbf{x} \to \mathbf{R}^n$ vanishes outside a closed subset of $U_\mathbf{x}$; we extend
it to a $C^r$ function $h_i$ on all of $A$ by letting it vanish outside $U_\mathbf{x}$. Then we
define

$$g(\mathbf{x}) = \sum_{i=1}^{\infty} h_i(\mathbf{x})$$

for each $\mathbf{x} \in A$. Each point of $A$ has a neighborhood on which $g$ equals a
finite sum of functions $h_i$; thus $g$ is of class $C^r$ on this neighborhood and
hence on all of $A$. Furthermore, if $\mathbf{x} \in S$, then

$$h_i(\mathbf{x}) = \phi_i(\mathbf{x}) g_i(\mathbf{x}) = \phi_i(\mathbf{x}) f(\mathbf{x})$$

for each $i$ for which $\phi_i(\mathbf{x}) \ne 0$. Hence if $\mathbf{x} \in S$,

$$g(\mathbf{x}) = \sum_{i=1}^{\infty} \phi_i(\mathbf{x}) f(\mathbf{x}) = f(\mathbf{x}). \quad □$$

Definition. Let $\mathbf{H}^k$ denote upper half-space in $\mathbf{R}^k$, consisting of those
$\mathbf{x} \in \mathbf{R}^k$ for which $x_k \ge 0$. Let $\mathbf{H}^k_+$ denote the open upper half-space,
consisting of those $\mathbf{x}$ for which $x_k > 0$.

We shall be particularly interested in functions defined on sets that are
open in $\mathbf{H}^k$ but not open in $\mathbf{R}^k$. In this situation, we have the following useful
result:

Lemma 23.2. Let $U$ be open in $\mathbf{H}^k$ but not in $\mathbf{R}^k$; let $\alpha : U \to \mathbf{R}^n$
be of class $C^r$. Let $\beta : U' \to \mathbf{R}^n$ be a $C^r$ extension of $\alpha$ defined on an
open set $U'$ of $\mathbf{R}^k$. Then for $\mathbf{x} \in U$, the derivative $D\beta(\mathbf{x})$ depends only
on the function $\alpha$ and is independent of the extension $\beta$. It follows that
we may denote this derivative by $D\alpha(\mathbf{x})$ without ambiguity.

Proof. Note that to calculate the partial derivative $\partial\beta_i/\partial x_j$ at $\mathbf{x}$, we
form the difference quotient

$$[\beta(\mathbf{x} + h\mathbf{e}_j) - \beta(\mathbf{x})]/h$$

and take the limit as $h$ approaches 0. For calculation purposes, it suffices to
let $h$ approach 0 through positive values. In that case, if $\mathbf{x}$ is in $\mathbf{H}^k$ then so
is $\mathbf{x} + h\mathbf{e}_j$. Since the functions $\beta$ and $\alpha$ agree at points of $\mathbf{H}^k$, the value of
$D\beta(\mathbf{x})$ depends only on $\alpha$. See Figure 23.6. □
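The idea of this proof can be tried out in the simplest case $k = n = 1$. In the sketch below, which is an illustration built on our own assumptions, `alpha` is the function $3x + x^2$ on $\mathbf{H}^1 = [0, \infty)$, and `beta_1` and `beta_2` are two different extensions of it to all of $\mathbf{R}$ (the second adds $x^4$ on the negative axis, which keeps it of class $C^3$). Forward difference quotients at the boundary point $0$ use only values of `alpha`, and they approach the same number as the two-sided quotients of either extension.

```python
def alpha(x):
    """alpha(x) = 3x + x^2, defined only on the half-line x >= 0."""
    if x < 0:
        raise ValueError("alpha is defined only on H^1")
    return 3.0 * x + x * x


def beta_1(x):
    """One extension of alpha to all of R."""
    return 3.0 * x + x * x


def beta_2(x):
    """A different extension: it agrees with alpha for x >= 0 but adds
    x^4 for x < 0."""
    return 3.0 * x + x * x + (x ** 4 if x < 0 else 0.0)


def central_quotient(func, x, h=1e-4):
    """Two-sided difference quotient; needs values on both sides of x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


# One-sided quotients at the boundary point 0, with h > 0 only.
forward = [(alpha(h) - alpha(0.0)) / h
           for h in (10.0 ** (-m) for m in range(1, 7))]

d_beta_1 = central_quotient(beta_1, 0.0)
d_beta_2 = central_quotient(beta_2, 0.0)
```

All three computations point to the value 3, so the derivative at the boundary point is determined by `alpha` alone.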



Figure 23.6 


Now we define what we mean by a manifold. 

Definition. Let $k > 0$. A $k$-manifold in $\mathbf{R}^n$ of class $C^r$ is a subspace
$M$ of $\mathbf{R}^n$ having the following property: For each $\mathbf{p} \in M$, there is an open
set $V$ of $M$ containing $\mathbf{p}$, a set $U$ that is open in either $\mathbf{R}^k$ or $\mathbf{H}^k$, and a
continuous map $\alpha : U \to V$ carrying $U$ onto $V$ in a one-to-one fashion, such
that:

(1) $\alpha$ is of class $C^r$.

(2) $\alpha^{-1} : V \to U$ is continuous.

(3) $D\alpha(\mathbf{x})$ has rank $k$ for each $\mathbf{x} \in U$.

The map $\alpha$ is called a coordinate patch on $M$ about $\mathbf{p}$.

We extend the definition to the case $k = 0$ by declaring a discrete
collection of points in $\mathbf{R}^n$ to be a 0-manifold in $\mathbf{R}^n$.

Note that a manifold without boundary is simply the special case of a
manifold where all the coordinate patches have domains that are open in $\mathbf{R}^k$.

Figure 23.7 illustrates a 2-manifold in $\mathbf{R}^3$. Indicated are two coordinate
patches on $M$, one whose domain is open in $\mathbf{R}^2$ and the other whose domain
is open in $\mathbf{H}^2$ but not in $\mathbf{R}^2$.

Figure 23.7

It seems clear from this figure that in a $k$-manifold, there are two kinds of
points, those that have neighborhoods that look like open $k$-balls, and those
that do not but instead have neighborhoods that look like open half-balls of
dimension $k$. The latter points constitute what we shall call the boundary
of $M$. Making this definition precise, however, requires a certain amount of
effort. We shall deal with this question in the next section.

We close this section with the following elementary result:

Lemma 23.3. Let $M$ be a manifold in $\mathbf{R}^n$, and let $\alpha : U \to V$ be
a coordinate patch on $M$. If $U_0$ is a subset of $U$ that is open in $U$, then
the restriction of $\alpha$ to $U_0$ is also a coordinate patch on $M$.

Proof. The fact that $U_0$ is open in $U$ and $\alpha^{-1}$ is continuous implies that
the set $V_0 = \alpha(U_0)$ is open in $V$. Then $U_0$ is open in $\mathbf{R}^k$ or $\mathbf{H}^k$ (according
as $U$ is open in $\mathbf{R}^k$ or $\mathbf{H}^k$), and $V_0$ is open in $M$. Then the map $\alpha|U_0$ is a
coordinate patch on $M$: it carries $U_0$ onto $V_0$ in a one-to-one fashion; it is of
class $C^r$ because $\alpha$ is; its inverse is continuous, being simply a restriction of
$\alpha^{-1}$; and its derivative has rank $k$ because $D\alpha$ does. □

Note that this result would not hold if we had not required $\alpha^{-1}$ to be
continuous. The map $\alpha$ of Example 3 satisfies all the other conditions for
a coordinate patch, but the restricted map $\alpha|U_0$ is not a coordinate patch
on $M$, because its image is not open in $M$.

EXERCISES 

1. Let $\alpha : \mathbf{R} \to \mathbf{R}^2$ be the map $\alpha(x) = (x, x^2)$; let $M$ be the image set of
$\alpha$. Show that $M$ is a 1-manifold in $\mathbf{R}^2$ covered by the single coordinate
patch $\alpha$.

2. Let $\beta : \mathbf{H}^1 \to \mathbf{R}^2$ be the map $\beta(x) = (x, x^2)$; let $N$ be the image set of $\beta$.
Show that $N$ is a 1-manifold in $\mathbf{R}^2$.

3. (a) Show that the unit circle $S^1$ is a 1-manifold in $\mathbf{R}^2$.

(b) Show that the function $\alpha : [0, 1) \to S^1$ given by

$$\alpha(t) = (\cos 2\pi t, \sin 2\pi t)$$

is not a coordinate patch on $S^1$.

4. Let $A$ be open in $\mathbf{R}^k$; let $f : A \to \mathbf{R}$ be of class $C^r$. Show that the graph
of $f$ is a $k$-manifold in $\mathbf{R}^{k+1}$.

5. Show that if $M$ is a $k$-manifold without boundary in $\mathbf{R}^m$, and if $N$ is an
$\ell$-manifold in $\mathbf{R}^n$, then $M \times N$ is a $k + \ell$ manifold in $\mathbf{R}^{m+n}$.

6. (a) Show that $I = [0, 1]$ is a 1-manifold in $\mathbf{R}^1$.

(b) Is $I \times I$ a 2-manifold in $\mathbf{R}^2$? Justify your answer.


§24. THE BOUNDARY OF A MANIFOLD


In this section, we make precise what we mean by the boundary of a manifold; 
and we prove a theorem that is useful in practice for constructing manifolds. 

To begin, we derive an important property of coordinate patches, namely, 
the fact that they “overlap differentiably.” We make this statement more 
precise as follows: 

Theorem 24.1. Let $M$ be a $k$-manifold in $\mathbf{R}^n$, of class $C^r$. Let
$\alpha_0 : U_0 \to V_0$ and $\alpha_1 : U_1 \to V_1$ be coordinate patches on $M$, with
$W = V_0 \cap V_1$ non-empty. Let $W_i = \alpha_i^{-1}(W)$. Then the map

$$\alpha_1^{-1} \circ \alpha_0 : W_0 \to W_1$$

is of class $C^r$, and its derivative is non-singular.

Typical cases are pictured in Figure 24.1. We often call $\alpha_1^{-1} \circ \alpha_0$ the
transition function between the coordinate patches $\alpha_0$ and $\alpha_1$.

Figure 24.1

Proof. It suffices to show that if $\alpha : U \to V$ is a coordinate patch on $M$,
then $\alpha^{-1} : V \to \mathbf{R}^k$ is of class $C^r$, as a map of the subset $V$ of $\mathbf{R}^n$ into $\mathbf{R}^k$. For
then it follows that, since $\alpha_0$ and $\alpha_1^{-1}$ are of class $C^r$, so is their composite
$\alpha_1^{-1} \circ \alpha_0$. The same argument applies to show $\alpha_0^{-1} \circ \alpha_1$ is of class $C^r$; then
the chain rule implies that both these transition functions have non-singular
derivatives.

To prove that $\alpha^{-1}$ is of class $C^r$, it suffices (by Lemma 23.1) to show that
it is locally of class $C^r$. Let $\mathbf{p}_0$ be a point of $V$; let $\alpha^{-1}(\mathbf{p}_0) = \mathbf{x}_0$. We show
$\alpha^{-1}$ extends to a $C^r$ function defined in a neighborhood of $\mathbf{p}_0$ in $\mathbf{R}^n$.

Let us first consider the case where $U$ is open in $\mathbf{H}^k$ but not in $\mathbf{R}^k$. By
assumption, we can extend $\alpha$ to a $C^r$ map $\beta$ of an open set $U'$ of $\mathbf{R}^k$ into
$\mathbf{R}^n$. Now $D\alpha(\mathbf{x}_0)$ has rank $k$, so some $k$ rows of this matrix are independent;
assume for convenience the first $k$ rows are independent. Let $\pi : \mathbf{R}^n \to \mathbf{R}^k$
project $\mathbf{R}^n$ onto its first $k$ coordinates. Then the map $g = \pi \circ \beta$ maps $U'$ into
$\mathbf{R}^k$, and $Dg(\mathbf{x}_0)$ is non-singular. By the inverse function theorem, $g$ is a $C^r$
diffeomorphism of an open set $W$ of $\mathbf{R}^k$ about $\mathbf{x}_0$ with an open set in $\mathbf{R}^k$. See
Figure 24.2.

We show that the map $h = g^{-1} \circ \pi$, which is of class $C^r$, is the desired
extension of $\alpha^{-1}$ to a neighborhood $A$ of $\mathbf{p}_0$. To begin, note that the set
$U_0 = W \cap U$ is open in $U$, so that the set $V_0 = \alpha(U_0)$ is open in $V$; this
means there is an open set $A$ of $\mathbf{R}^n$ such that $A \cap V = V_0$. We can choose $A$
so it is contained in the domain of $h$ (by intersecting with $\pi^{-1}(g(W))$ if
necessary). Then $h : A \to \mathbf{R}^k$ is of class $C^r$; and if $\mathbf{p} \in A \cap V = V_0$, then we
let $\mathbf{x} = \alpha^{-1}(\mathbf{p})$ and compute

$$h(\mathbf{p}) = h(\alpha(\mathbf{x})) = g^{-1}(\pi(\alpha(\mathbf{x}))) = g^{-1}(g(\mathbf{x})) = \mathbf{x} = \alpha^{-1}(\mathbf{p}),$$

as desired.

A similar argument holds if $U$ is open in $\mathbf{R}^k$. In this case, we set $U' = U$
and $\beta = \alpha$, and the preceding argument proceeds unchanged. □
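A concrete transition function may help fix ideas. The sketch below uses two patches on the open upper half of the unit circle that are chosen by us for illustration and do not come from the text: $\alpha_0(t) = (\cos t, \sin t)$ on $(0, \pi)$ and $\alpha_1(s) = (s, (1 - s^2)^{1/2})$ on $(-1, 1)$. Since $\alpha_1^{-1}$ is just projection onto the first coordinate, the transition function $\alpha_1^{-1} \circ \alpha_0$ is $t \mapsto \cos t$, whose derivative $-\sin t$ is non-zero on $(0, \pi)$.

```python
from math import cos, pi, sin, sqrt


def alpha_0(t):
    """Patch on the open upper semicircle, with domain (0, pi)."""
    return (cos(t), sin(t))


def alpha_1(s):
    """Another patch on the same set, with domain (-1, 1)."""
    return (s, sqrt(1.0 - s * s))


def alpha_1_inverse(p):
    """Inverse of alpha_1: projection onto the first coordinate."""
    return p[0]


def transition(t):
    """The transition function alpha_1^{-1} o alpha_0."""
    return alpha_1_inverse(alpha_0(t))


def derivative(func, t, h=1e-6):
    """Central difference approximation to func'(t)."""
    return (func(t + h) - func(t - h)) / (2.0 * h)


ts = [pi * i / 10 for i in range(1, 10)]

# alpha_1(transition(t)) should reproduce alpha_0(t) on the overlap.
patch_mismatch = max(
    max(abs(u - v) for u, v in zip(alpha_1(transition(t)), alpha_0(t)))
    for t in ts
)
derivs = [derivative(transition, t) for t in ts]
```

The entries of `derivs` stay away from zero at the sample points, as the theorem predicts for the $1$ by $1$ derivative matrix of a transition function.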

Now we define the boundary of a manifold.

Definition. Let $M$ be a $k$-manifold in $\mathbf{R}^n$; let $\mathbf{p} \in M$. If there is a
coordinate patch $\alpha : U \to V$ on $M$ about $\mathbf{p}$ such that $U$ is open in $\mathbf{R}^k$, we
say $\mathbf{p}$ is an interior point of $M$. Otherwise, we say $\mathbf{p}$ is a boundary point
of $M$. We denote the set of boundary points of $M$ by $\partial M$, and call this set
the boundary of $M$.

Note that our use here of the terms “interior” and “boundary” has nothing
to do with the way these terms are used in general topology. Any subset $S$
of $\mathbf{R}^n$ has an interior and a boundary and an exterior in the topological sense,
which we denote by Int $S$ and Bd $S$ and Ext $S$, respectively. For a
manifold $M$, we denote its boundary by $\partial M$ and its interior by $M - \partial M$.

Given $M$, one can readily identify the boundary points of $M$ by use of
the following criterion:

Lemma 24.2. Let $M$ be a $k$-manifold in $\mathbf{R}^n$; let $\alpha : U \to V$ be a
coordinate patch about the point $\mathbf{p}$ of $M$.

(a) If $U$ is open in $\mathbf{R}^k$, then $\mathbf{p}$ is an interior point of $M$.

(b) If $U$ is open in $\mathbf{H}^k$ and if $\mathbf{p} = \alpha(\mathbf{x}_0)$ for $\mathbf{x}_0 \in \mathbf{H}^k_+$, then $\mathbf{p}$ is an
interior point of $M$.

(c) If $U$ is open in $\mathbf{H}^k$ and $\mathbf{p} = \alpha(\mathbf{x}_0)$ for $\mathbf{x}_0 \in \mathbf{R}^{k-1} \times 0$, then $\mathbf{p}$ is a
boundary point of $M$.

Proof. (a) is immediate from the definition. (b) is almost as easy. Given
$\alpha : U \to V$ as in (b), let $U_0 = U \cap \mathbf{H}^k_+$ and let $V_0 = \alpha(U_0)$. Then $\alpha|U_0$,
mapping $U_0$ onto $V_0$, is a coordinate patch about $\mathbf{p}$, with $U_0$ open in $\mathbf{R}^k$.

We prove (c). Let $\alpha_0 : U_0 \to V_0$ be a coordinate patch about $\mathbf{p}$, with
$U_0$ open in $\mathbf{H}^k$ and $\mathbf{p} = \alpha_0(\mathbf{x}_0)$ for $\mathbf{x}_0 \in \mathbf{R}^{k-1} \times 0$. We assume there is a
coordinate patch $\alpha_1 : U_1 \to V_1$ about $\mathbf{p}$ with $U_1$ open in $\mathbf{R}^k$, and derive a
contradiction.



206 Manifolds 


Chapter 5 


Since $V_0$ and $V_1$ are open in $M$, the set $W = V_0 \cap V_1$ is also open in $M$.
Let $W_i = \alpha_i^{-1}(W)$ for $i = 0, 1$; then $W_0$ is open in $\mathbf{H}^k$ and contains $\mathbf{x}_0$, and
$W_1$ is open in $\mathbf{R}^k$. The preceding theorem tells us that the transition function

$$\alpha_0^{-1} \circ \alpha_1 : W_1 \to W_0$$

is a map of class $C^r$ carrying $W_1$ onto $W_0$ in a one-to-one fashion, with
non-singular derivative. Then Theorem 8.2 tells us that the image set of this
map is open in $\mathbf{R}^k$. But $W_0$ is contained in $\mathbf{H}^k$ and contains the point $\mathbf{x}_0$ of
$\mathbf{R}^{k-1} \times 0$, so it is not open in $\mathbf{R}^k$! See Figure 24.3. □

Note that $\mathbf{H}^k$ is itself a $k$-manifold in $\mathbf{R}^k$; and it follows from this lemma
that $\partial\mathbf{H}^k = \mathbf{R}^{k-1} \times 0$.

Theorem 24.3. Let M be a k-manifold in R n , of class C r . If dM 
%s non-empty, then dM is a k - 1 manifold without boundary in R n of 
class C r . 

Proof. Let $\mathbf{p} \in \partial M$. Let $\alpha : U \to V$ be a coordinate patch on $M$
about $\mathbf{p}$. Then $U$ is open in $\mathbf{H}^k$ and $\mathbf{p} = \alpha(\mathbf{x}_0)$ for some $\mathbf{x}_0 \in \partial \mathbf{H}^k$. By the
preceding lemma, each point of $U \cap \mathbf{H}^k_+$ is mapped by $\alpha$ to an interior point
of $M$, and each point of $U \cap (\partial \mathbf{H}^k)$ is mapped to a point of $\partial M$. Thus the
restriction of $\alpha$ to $U \cap (\partial \mathbf{H}^k)$ carries this set in a one-to-one fashion onto the
open set $V_0 = V \cap \partial M$ of $\partial M$. Let $U_0$ be the open set of $\mathbf{R}^{k-1}$ such that
$U_0 \times 0 = U \cap \partial \mathbf{H}^k$; if $\mathbf{x} \in U_0$, define $\alpha_0(\mathbf{x}) = \alpha(\mathbf{x}, 0)$. Then $\alpha_0 : U_0 \to V_0$ is
a coordinate patch on $\partial M$. It is of class $C^r$ because $\alpha$ is, and its derivative
has rank $k - 1$ because $D\alpha_0(\mathbf{x})$ consists simply of the first $k - 1$ columns
of the matrix $D\alpha(\mathbf{x}, 0)$. The inverse $\alpha_0^{-1}$ is continuous because it equals the
restriction to $V_0$ of the continuous function $\alpha^{-1}$, followed by projection of $\mathbf{R}^k$
onto its first $k - 1$ coordinates. □

The coordinate patch $\alpha_0$ on $\partial M$ constructed in the proof of this theorem
is said to be obtained by restricting the coordinate patch $\alpha$ on $M$.

Finally, we prove a theorem that is useful in practice for constructing 
manifolds. 

Theorem 24.4. Let $O$ be open in $\mathbf{R}^n$; let $f : O \to \mathbf{R}$ be of class $C^r$.
Let $M$ be the set of points $\mathbf{x}$ for which $f(\mathbf{x}) = 0$; let $N$ be the set of points
for which $f(\mathbf{x}) \geq 0$. Suppose $M$ is non-empty and $Df(\mathbf{x})$ has rank 1 at
each point of $M$. Then $N$ is an $n$-manifold in $\mathbf{R}^n$ and $\partial N = M$.

Proof. Suppose first that $\mathbf{p}$ is a point of $N$ such that $f(\mathbf{p}) > 0$. Let
$U$ be the open set in $\mathbf{R}^n$ consisting of all points $\mathbf{x}$ for which $f(\mathbf{x}) > 0$; let
$\alpha : U \to U$ be the identity map. Then $\alpha$ is (trivially) a coordinate patch
on $N$ about $\mathbf{p}$ whose domain is open in $\mathbf{R}^n$.

Now suppose that $f(\mathbf{p}) = 0$. Since $Df(\mathbf{p})$ is non-zero, at least one of
the partial derivatives $D_i f(\mathbf{p})$ is non-zero. Suppose $D_n f(\mathbf{p}) \neq 0$. Define
$F : O \to \mathbf{R}^n$ by the equation $F(\mathbf{x}) = (x_1, \ldots, x_{n-1}, f(\mathbf{x}))$. Then

$$DF = \begin{bmatrix} I_{n-1} & 0 \\ * & D_n f \end{bmatrix},$$

so that $DF(\mathbf{p})$ is non-singular. It follows that $F$ is a diffeomorphism of a
neighborhood $A$ of $\mathbf{p}$ in $\mathbf{R}^n$ with an open set $B$ of $\mathbf{R}^n$. Furthermore, $F$ carries
the open set $A \cap N$ of $N$ onto the open set $B \cap \mathbf{H}^n$ of $\mathbf{H}^n$, since $\mathbf{x} \in N$ if and
only if $f(\mathbf{x}) \geq 0$. It also carries $A \cap M$ onto $B \cap \partial \mathbf{H}^n$, since $\mathbf{x} \in M$ if and
only if $f(\mathbf{x}) = 0$. Then $F^{-1} : B \cap \mathbf{H}^n \to A \cap N$ is the required coordinate
patch on $N$. See Figure 24.4. □


Definition. Let $B^n(a)$ consist of all points $\mathbf{x}$ of $\mathbf{R}^n$ for which $\|\mathbf{x}\| \leq a$,
and let $S^{n-1}(a)$ consist of all $\mathbf{x}$ for which $\|\mathbf{x}\| = a$. We call them the $n$-ball
and the $n - 1$ sphere, respectively, of radius $a$.

Corollary 24.5. The $n$-ball $B^n(a)$ is an $n$-manifold in $\mathbf{R}^n$ of class
$C^\infty$, and $S^{n-1}(a) = \partial B^n(a)$.

Proof. We apply the preceding theorem to the function $f(\mathbf{x}) = a^2 - \|\mathbf{x}\|^2$.
Then

$$Df(\mathbf{x}) = [(-2x_1) \ \cdots \ (-2x_n)],$$

which is non-zero at each point of $S^{n-1}(a)$. □


EXERCISES 

1. Show that the solid torus is a 3-manifold, and its boundary is the torus T. 
(See the exercises of §17.) [Hint: Write the equation for T in cartesian 
coordinates and apply Theorem 24.4.] 

2. Prove the following: 

Theorem. Let $f : \mathbf{R}^{n+k} \to \mathbf{R}^n$ be of class $C^r$. Let $M$ be the set of all
$\mathbf{x}$ such that $f(\mathbf{x}) = 0$. Assume that $M$ is non-empty and that $Df(\mathbf{x})$
has rank $n$ for $\mathbf{x} \in M$. Then $M$ is a $k$-manifold without boundary in
$\mathbf{R}^{n+k}$. Furthermore, if $N$ is the set of all $\mathbf{x}$ for which

$$f_1(\mathbf{x}) = \cdots = f_{n-1}(\mathbf{x}) = 0 \quad \text{and} \quad f_n(\mathbf{x}) \geq 0,$$

and if the matrix

$$\partial(f_1, \ldots, f_{n-1})/\partial \mathbf{x}$$

has rank $n - 1$ at each point of $N$, then $N$ is a $k + 1$ manifold, and

$$\partial N = M.$$

3. Let $f, g : \mathbf{R}^3 \to \mathbf{R}$ be of class $C^r$. Under what conditions can you be
sure that the solution set of the system of equations $f(x, y, z) = 0$,
$g(x, y, z) = 0$ is a smooth curve without singularities (i.e., a 1-manifold
without boundary)?

4. Show that the upper hemisphere of $S^{n-1}(a)$, defined by the equation

$$E^{n-1}_+(a) = S^{n-1}(a) \cap \mathbf{H}^n,$$

is an $n - 1$ manifold. What is its boundary?

5. Let $O(3)$ denote the set of all orthogonal 3 by 3 matrices, considered as
a subspace of $\mathbf{R}^9$.

(a) Define a $C^\infty$ function $f : \mathbf{R}^9 \to \mathbf{R}^6$ such that $O(3)$ is the solution set
of the equation $f(\mathbf{x}) = 0$.

(b) Show that $O(3)$ is a compact 3-manifold in $\mathbf{R}^9$ without boundary.
[Hint: Show the rows of $Df(\mathbf{x})$ are independent if $\mathbf{x} \in O(3)$.]

6. Let $O(n)$ denote the set of all orthogonal $n$ by $n$ matrices, considered
as a subspace of $\mathbf{R}^N$, where $N = n^2$. Show $O(n)$ is a compact manifold
without boundary. What is its dimension?

The manifold $O(n)$ is a particular example of what is called a Lie
group (pronounced “lee group”). It is a group under the operation of
matrix multiplication; it is a $C^\infty$ manifold; and the product operation and
the map $A \to A^{-1}$ are $C^\infty$ maps. Lie groups are of increasing importance
in theoretical physics, as well as in mathematics.


§25. INTEGRATING A SCALAR FUNCTION OVER A MANIFOLD 

Now we define what we mean by the integral of a continuous scalar function $f$
over a manifold $M$ in $\mathbf{R}^n$. For simplicity, we shall restrict ourselves to the case
where $M$ is compact. The extension to the general case can be carried out
by methods analogous to those used in §16 in treating the extended integral.

First we define the integral in the case where the support of $f$ can be
covered by a single coordinate patch.

Definition. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class $C^r$. Let
$f : M \to \mathbf{R}$ be a continuous function. Let $C = \text{Support } f$; then $C$ is
compact. Suppose there is a coordinate patch $\alpha : U \to V$ on $M$ such that
$C \subset V$. Now $\alpha^{-1}(C)$ is compact. Therefore, by replacing $U$ by a smaller open
set if necessary, we can assume that $U$ is bounded. We define the integral
of $f$ over $M$ by the equation

$$\int_M f \, dV = \int_{\text{Int } U} (f \circ \alpha) V(D\alpha).$$

Here $\text{Int } U = U$ if $U$ is open in $\mathbf{R}^k$, and $\text{Int } U = U \cap \mathbf{H}^k_+$ if $U$ is open in $\mathbf{H}^k$
but not in $\mathbf{R}^k$.


It is easy to see this integral exists as an ordinary integral, and hence
as an extended integral: The function $F = (f \circ \alpha) V(D\alpha)$ is continuous on $U$
and vanishes outside the compact set $\alpha^{-1}(C)$; hence $F$ is bounded. If $U$ is
open in $\mathbf{R}^k$, then $F$ vanishes near each point $\mathbf{x}_0$ of $\text{Bd } U$. If $U$ is not open
in $\mathbf{R}^k$, then $F$ vanishes near each point of $\text{Bd } U$ not in $\partial \mathbf{H}^k$, a set that has
measure zero in $\mathbf{R}^k$. In either case, $F$ is integrable over $U$ and hence over
$\text{Int } U$. See Figure 25.1.



Figure 25.1 


Lemma 25.1. If the support of $f$ can be covered by a single coordinate
patch, the integral $\int_M f \, dV$ is well-defined, independent of the
choice of coordinate patch.

Proof. We prove a preliminary result. Let $\alpha : U \to V$ be a coordinate
patch containing the support of $f$. Let $W$ be an open set in $U$ such that
$\alpha(W)$ also contains the support of $f$. Then

$$\int_{\text{Int } W} (f \circ \alpha) V(D\alpha) = \int_{\text{Int } U} (f \circ \alpha) V(D\alpha);$$

the (ordinary) integrals over $W$ and $U$ are equal because the integrand
vanishes outside $W$; then one applies Theorem 13.6.

Let $\alpha_i : U_i \to V_i$ for $i = 0, 1$ be coordinate patches on $M$ such that both
$V_0$ and $V_1$ contain the support of $f$. We wish to show that

$$\int_{\text{Int } U_0} (f \circ \alpha_0) V(D\alpha_0) = \int_{\text{Int } U_1} (f \circ \alpha_1) V(D\alpha_1).$$

Let $W = V_0 \cap V_1$ and let $W_i = \alpha_i^{-1}(W)$. In view of the result of the preceding
paragraph, it suffices to show that this equation holds with $U_i$ replaced by
$W_i$, for $i = 0, 1$. Since $\alpha_1^{-1} \circ \alpha_0 : \text{Int } W_0 \to \text{Int } W_1$ is a diffeomorphism, this
result follows at once from Theorem 22.1. □


To define $\int_M f \, dV$ in general, we use a partition of unity on $M$.

Lemma 25.2. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class $C^r$.
Given a covering of $M$ by coordinate patches, there exists a finite
collection of $C^\infty$ functions $\phi_1, \ldots, \phi_\ell$ mapping $\mathbf{R}^n$ into $\mathbf{R}$ such that:

(1) $\phi_i(\mathbf{x}) \geq 0$ for all $\mathbf{x}$.

(2) Given $i$, the support of $\phi_i$ is compact and there is a coordinate
patch $\alpha_i : U_i \to V_i$ belonging to the given covering such that

$$((\text{Support } \phi_i) \cap M) \subset V_i.$$

(3) $\sum \phi_i(\mathbf{x}) = 1$ for $\mathbf{x} \in M$.

We call $\{\phi_1, \ldots, \phi_\ell\}$ a partition of unity on $M$ dominated by the
given collection of coordinate patches.

Proof. For each coordinate patch $\alpha : U \to V$ belonging to the given
collection, choose an open set $A_V$ of $\mathbf{R}^n$ such that $A_V \cap M = V$. Let $A$ be
the union of the sets $A_V$. Choose a partition of unity on $A$ that is dominated
by this open covering of $A$. Local finiteness guarantees that all but finitely
many of the functions in the partition of unity vanish identically on $M$. Let
$\phi_1, \ldots, \phi_\ell$ be those that do not. □

Definition. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class $C^r$. Let
$f : M \to \mathbf{R}$ be a continuous function. Choose a partition of unity $\phi_1, \ldots, \phi_\ell$
on $M$ that is dominated by the collection of all coordinate patches on $M$. We
define the integral of $f$ over $M$ by the equation

$$\int_M f \, dV = \sum_{i=1}^{\ell} \Big[ \int_M (\phi_i f) \, dV \Big].$$

Then we define the ($k$-dimensional) volume of $M$ by the equation

$$v(M) = \int_M 1 \, dV.$$

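If the support of $f$ happens to lie in a single coordinate patch $\alpha : U \to V$,
this definition agrees with the preceding definition. For in that case, letting
$A = \text{Int } U$, we have

$$\sum_{i=1}^{\ell} \Big[ \int_M (\phi_i f) \, dV \Big] = \sum_{i=1}^{\ell} \Big[ \int_A (\phi_i \circ \alpha)(f \circ \alpha) V(D\alpha) \Big] \quad \text{by definition,}$$

$$= \int_A \Big[ \sum_{i=1}^{\ell} (\phi_i \circ \alpha)(f \circ \alpha) V(D\alpha) \Big] \quad \text{by linearity,}$$

$$= \int_A (f \circ \alpha) V(D\alpha) \quad \text{because } \sum_{i=1}^{\ell} (\phi_i \circ \alpha) = 1 \text{ on } A,$$

$$= \int_M f \, dV \quad \text{by definition.}$$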

We note also that this definition is independent of the choice of the
partition of unity. Let $\psi_1, \ldots, \psi_m$ be another choice for the partition of unity.
Because the support of $\psi_j f$ lies in a single coordinate patch, we can apply
the computation just given (replacing $f$ by $\psi_j f$) to conclude that

$$\sum_{i=1}^{\ell} \Big[ \int_M (\phi_i \psi_j f) \, dV \Big] = \int_M (\psi_j f) \, dV.$$

Summing over $j$, we have

$$\sum_{j=1}^{m} \sum_{i=1}^{\ell} \Big[ \int_M (\phi_i \psi_j f) \, dV \Big] = \sum_{j=1}^{m} \Big[ \int_M (\psi_j f) \, dV \Big].$$

Symmetry shows that this double summation also equals

$$\sum_{i=1}^{\ell} \Big[ \int_M (\phi_i f) \, dV \Big],$$

as desired.

Linearity of the integral follows at once. We state it formally as a theorem: 





Theorem 25.3. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class $C^r$.
Let $f, g : M \to \mathbf{R}$ be continuous. Then

$$\int_M (af + bg) \, dV = a \int_M f \, dV + b \int_M g \, dV. \quad □$$

This definition of the integral $\int_M f \, dV$ is satisfactory for theoretical
purposes, but not for practical purposes. If one wishes actually to integrate a
function over the $n - 1$ sphere $S^{n-1}$, for example, what one does is to break
$S^{n-1}$ into suitable “pieces,” integrate over each piece separately, and add the
results together. We now prove a theorem that makes this procedure more
precise. We shall use this result in some examples and exercises.

Definition. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class $C^r$. A
subset $D$ of $M$ is said to have measure zero in $M$ if it can be covered by
countably many coordinate patches $\alpha_i : U_i \to V_i$ such that the set

$$D_i = \alpha_i^{-1}(D \cap V_i)$$

has measure zero in $\mathbf{R}^k$ for each $i$.

An equivalent definition is to require that for any coordinate patch
$\alpha : U \to V$ on $M$, the set $\alpha^{-1}(D \cap V)$ have measure zero in $\mathbf{R}^k$. To verify
this fact, it suffices to show that $\alpha^{-1}(D \cap V \cap V_i)$ has measure zero for each $i$.
And this follows from the fact that the set $\alpha_i^{-1}(D \cap V \cap V_i)$ has measure zero
because it is a subset of $D_i$, and that $\alpha^{-1} \circ \alpha_i$ is of class $C^r$.

*Theorem 25.4. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$, of class
$C^r$. Let $f : M \to \mathbf{R}$ be a continuous function. Suppose that $\alpha_i : A_i \to M_i$,
for $i = 1, \ldots, N$, is a coordinate patch on $M$, such that $A_i$ is open
in $\mathbf{R}^k$ and $M$ is the disjoint union of the open sets $M_1, \ldots, M_N$ of $M$
and a set $K$ of measure zero in $M$. Then

$$(*) \qquad \int_M f \, dV = \sum_{i=1}^{N} \Big[ \int_{A_i} (f \circ \alpha_i) V(D\alpha_i) \Big].$$

This theorem says that $\int_M f \, dV$ can be evaluated by breaking $M$ up
into pieces that are parametrized-manifolds and integrating $f$ over each piece
separately.

Proof. Since both sides of (*) are linear in $f$, it suffices to prove the
theorem in the case where the set $C = \text{Support } f$ is covered by a single
coordinate patch $\alpha : U \to V$. We can assume that $U$ is bounded. Then

$$\int_M f \, dV = \int_{\text{Int } U} (f \circ \alpha) V(D\alpha),$$

by definition.
Bd not in L. Then we note that

$$\int_W F = \int_{(\text{Int } U) - L} F \quad \text{by additivity,}$$

$$= \int_{\text{Int } U} F \quad \text{since } L \text{ has measure zero,}$$

$$= \int_M f \, dV \quad \text{by definition.}$$


Step 2. We complete the proof by showing that

$$\int_W F = \sum_{i=1}^{N} \int_{A_i} F_i,$$

where $F_i = (f \circ \alpha_i) V(D\alpha_i)$. See Figure 25.4.

The map $\alpha_i^{-1} \circ \alpha$ is a diffeomorphism carrying $W_i$ onto the open set

$$B_i = \alpha_i^{-1}(M_i \cap V)$$

of $\mathbf{R}^k$. It follows from the change of variables theorem that

$$\int_{W_i} F = \int_{B_i} F_i,$$

just as in Theorem 22.1. To complete the proof, we show that

$$\int_{B_i} F_i = \int_{A_i} F_i.$$

These integrals may not be ordinary integrals, so some care is required.

Since $C = \text{Support } f$ is closed in $M$, the set $\alpha_i^{-1}(C)$ is closed in $A_i$, and
its complement

$$D_i = A_i - \alpha_i^{-1}(C)$$

is open in $A_i$ and thus in $\mathbf{R}^k$. The function $F_i$ vanishes on $D_i$. We apply
additivity of the extended integral to conclude that

$$\int_{A_i} F_i = \int_{B_i} F_i + \int_{D_i} F_i - \int_{B_i \cap D_i} F_i.$$

The last two integrals vanish. □

EXAMPLE 1. Consider the 2-sphere $S^2(a)$ of radius $a$ in $\mathbf{R}^3$. We computed
the area of its open upper hemisphere as $2\pi a^2$. (See Example 4 of §22.) Since
the reflection map $(x, y, z) \to (x, y, -z)$ is an isometry of $\mathbf{R}^3$, the open lower
hemisphere also has area $2\pi a^2$. (See the exercises of §22.) Since the upper
and lower hemispheres constitute all of the sphere except for a set of measure
zero in the sphere, it follows that $S^2(a)$ has area $4\pi a^2$.

EXAMPLE 2. Here is an alternate method for computing the area of the
2-sphere; it involves no improper integrals.

Given $z_0 \in \mathbf{R}$ with $|z_0| < a$, the intersection of $S^2(a)$ with the plane
$z = z_0$ is the circle

$$z = z_0; \qquad x^2 + y^2 = a^2 - (z_0)^2.$$

This fact suggests that we parametrize $S^2(a)$ by the function $\alpha : A \to \mathbf{R}^3$
given by the equation

$$\alpha(t, z) = ((a^2 - z^2)^{1/2} \cos t, \ (a^2 - z^2)^{1/2} \sin t, \ z),$$

where $A$ is the set of all $(t, z)$ for which $0 < t < 2\pi$ and $|z| < a$. It is easy
to check that $\alpha$ is a coordinate patch that covers all of $S^2(a)$ except for a
great-circle arc, which has measure zero in the sphere. See Figure 25.5. By
the preceding theorem, we may use this coordinate patch to compute the area
of $S^2(a)$. We have

$$D\alpha = \begin{bmatrix} -(a^2 - z^2)^{1/2} \sin t & (-z \cos t)/(a^2 - z^2)^{1/2} \\ (a^2 - z^2)^{1/2} \cos t & (-z \sin t)/(a^2 - z^2)^{1/2} \\ 0 & 1 \end{bmatrix},$$

whence $V(D\alpha) = a$, as you can check. Then $v(S^2(a)) = \int_A a = 4\pi a^2$.
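
The computation in Example 2 can also be checked numerically. The following is a minimal sketch, not part of the text's argument: it assumes that $V(D\alpha)$ is computed as the square root of $\det(D\alpha^{tr} \cdot D\alpha)$, and the function names (`jacobian_columns`, `volume_factor`, `sphere_area`) are hypothetical names chosen for illustration. The integral over $A$ is approximated by a midpoint rule, which never evaluates the integrand at $z = \pm a$.

```python
import math

def jacobian_columns(t, z, a):
    """Columns of D(alpha) for alpha(t, z) = (r cos t, r sin t, z), r = sqrt(a^2 - z^2)."""
    r = math.sqrt(a * a - z * z)
    col_t = (-r * math.sin(t), r * math.cos(t), 0.0)            # partial derivative in t
    col_z = (-z * math.cos(t) / r, -z * math.sin(t) / r, 1.0)   # partial derivative in z
    return col_t, col_z

def volume_factor(t, z, a):
    """V(D alpha), computed as sqrt(det(G)) where G is the 2 by 2 Gram matrix of the columns."""
    u, w = jacobian_columns(t, z, a)
    dot = lambda p, q: sum(pi * qi for pi, qi in zip(p, q))
    gram_det = dot(u, u) * dot(w, w) - dot(u, w) ** 2
    return math.sqrt(gram_det)

def sphere_area(a, n=100):
    """Midpoint-rule approximation of the integral of V(D alpha) over A = (0, 2 pi) x (-a, a)."""
    dt = 2.0 * math.pi / n
    dz = 2.0 * a / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        for j in range(n):
            z = -a + (j + 0.5) * dz
            total += volume_factor(t, z, a) * dt * dz
    return total

if __name__ == "__main__":
    a = 2.0
    print(volume_factor(1.0, 0.3, a))          # should be close to a
    print(sphere_area(a), 4 * math.pi * a**2)  # the two values should agree closely
```

Because the integrand is the constant $a$, the midpoint rule here is exact up to rounding; for a patch with non-constant $V(D\alpha)$ one would expect only approximate agreement.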



Figure 25.5 


EXERCISES 

1. Check the computations made in Example 2.

2. Let $\alpha(t), \beta(t), f(t)$ be real-valued functions of class $C^1$ on $[0, 1]$, with
$f(t) > 0$. Suppose $M$ is a 2-manifold in $\mathbf{R}^3$ whose intersection with the
plane $z = t$ is the circle

$$(x - \alpha(t))^2 + (y - \beta(t))^2 = (f(t))^2; \qquad z = t$$

if $0 < t < 1$, and is empty otherwise.

(a) Set up an integral for the area of $M$. [Hint: Proceed as in Example 2.]

(b) Evaluate when $\alpha$ and $\beta$ are constant and $f(t) = 1 + t^2$.

(c) What form does the integral take when $f$ is constant and $\alpha(t) = 0$
and $\beta(t) = at$? (This integral cannot be evaluated in terms of the
elementary functions.)

3. Consider the torus $T$ of Exercise 7 of §17.

(a) Find the area of this torus. [Hint: The cylindrical coordinate
transformation carries a cylinder onto $T$. Parametrize the cylinder using
the fact that its cross-sections are circles.]

(b) Find the area of that portion of $T$ satisfying the condition
$x^2 + y^2 > b^2$.

4. Let $M$ be a compact $k$-manifold in $\mathbf{R}^n$. Let $h : \mathbf{R}^n \to \mathbf{R}^n$ be an isometry;
let $N = h(M)$. Let $f : N \to \mathbf{R}$ be a continuous function. Show that $N$
is a $k$-manifold in $\mathbf{R}^n$, and

$$\int_N f \, dV = \int_M (f \circ h) \, dV.$$

Conclude that $M$ and $N$ have the same volume.

5. (a) Express the volume of $S^n(a)$ in terms of the volume of $B^{n-1}(a)$.
[Hint: Follow the pattern of Example 2.]

(b) Show that for $t > 0$,

$$v(S^n(t)) = D \, v(B^{n+1}(t)).$$

[Hint: Use the result of Exercise 6 of §19.]

6. The centroid of a compact manifold $M$ in $\mathbf{R}^n$ is defined by a formula
like that given in Exercise 3 of §22. Show that if $M$ is symmetric with
respect to the subspace $x_i = 0$ of $\mathbf{R}^n$, then $c_i(M) = 0$.

*7. Let $E^n_+(a)$ denote the intersection of $S^n(a)$ with upper half-space $\mathbf{H}^{n+1}$.
Let $\lambda_n = v(B^n(1))$.

(a) Find the centroid of $E^n_+(a)$ in terms of $\lambda_n$ and $\lambda_{n-1}$.

(b) Find the centroid of $E^n_+(a)$ in terms of the centroid of $B^{n-1}_+(a)$. (See
the exercises of §19.)

8. Let $M$ and $N$ be compact manifolds without boundary in $\mathbf{R}^m$ and $\mathbf{R}^n$,
respectively.

(a) Let $f : M \to \mathbf{R}$ and $g : N \to \mathbf{R}$ be continuous. Show that

$$\int_{M \times N} f \cdot g \, dV = \Big[ \int_M f \, dV \Big] \Big[ \int_N g \, dV \Big].$$

[Hint: Consider the case where the supports of $f$ and $g$ are contained
in coordinate patches.]

(b) Show that $v(M \times N) = v(M) \cdot v(N)$.

(c) Find the area of the 2-manifold $S^1 \times S^1$ in $\mathbf{R}^4$.
We have treated, with considerable generality, two of the major topics of
multivariable calculus: differentiation and integration. We now turn to the
third topic. It is commonly called “vector integral calculus,” and its major
theorems bear the names of Green, Gauss, and Stokes. In calculus, one limits
oneself to curves and surfaces in $\mathbf{R}^3$. We shall deal more generally with
$k$-manifolds in $\mathbf{R}^n$. In dealing with this general situation, one finds that the
concepts of linear algebra and vector calculus are no longer adequate. One
needs to introduce concepts that are more sophisticated; they constitute a
subject called multilinear algebra that is a sequel to linear algebra.

In the first three sections of this chapter, we introduce this subject; in 
these sections we use only the material on linear algebra treated in Chap- 
ter 1. In the remainder of the chapter, we combine the notions of multilinear 
algebra with results about differentiation from Chapter 2 to define and study 
differential forms in R n . Differential forms and their operators are what are 
used to replace vector and scalar fields and their operators — grad, curl, and 
div — when one passes from R 3 to R n . 

In the succeeding chapter, additional topics, including integration, man- 
ifolds, and the change of variables theorem, will be brought into the picture, 
in order to treat the generalized version of Stokes’ theorem in $\mathbf{R}^n$.
§26. MULTILINEAR ALGEBRA


Tensors 

Definition. Let $V$ be a vector space. Let $V^k = V \times \cdots \times V$ denote the
set of all $k$-tuples $(\mathbf{v}_1, \ldots, \mathbf{v}_k)$ of vectors of $V$. A function $f : V^k \to \mathbf{R}$ is
said to be linear in the $i$th variable if, given fixed vectors $\mathbf{v}_j$ for $j \neq i$, the
function $T : V \to \mathbf{R}$ defined by

$$T(\mathbf{v}) = f(\mathbf{v}_1, \ldots, \mathbf{v}_{i-1}, \mathbf{v}, \mathbf{v}_{i+1}, \ldots, \mathbf{v}_k)$$

is linear. The function $f$ is said to be multilinear if it is linear in the $i$th
variable for each $i$. Such a function $f$ is also called a $k$-tensor, or a tensor of
order $k$, on $V$. We denote the set of all $k$-tensors on $V$ by the symbol $\mathcal{L}^k(V)$.
If $k = 1$, then $\mathcal{L}^1(V)$ is just the set of all linear transformations $f : V \to \mathbf{R}$.
It is sometimes called the dual space of $V$ and denoted by $V^*$.

How this notion of tensor relates to the tensors used by physicists and 
geometers remains to be seen. 

Theorem 26.1. The set of all $k$-tensors on $V$ constitutes a vector
space if we define

$$(f + g)(\mathbf{v}_1, \ldots, \mathbf{v}_k) = f(\mathbf{v}_1, \ldots, \mathbf{v}_k) + g(\mathbf{v}_1, \ldots, \mathbf{v}_k),$$

$$(cf)(\mathbf{v}_1, \ldots, \mathbf{v}_k) = c(f(\mathbf{v}_1, \ldots, \mathbf{v}_k)).$$

Proof. The proof is left as an exercise. The zero tensor is the function
whose value is zero on every $k$-tuple of vectors. □

Just as is the case with linear transformations, a multilinear transforma- 
tion is entirely determined once one knows its values on basis elements. That 
we now prove. 

Lemma 26.2. Let $\mathbf{a}_1, \ldots, \mathbf{a}_n$ be a basis for $V$. If $f, g : V^k \to \mathbf{R}$ are
$k$-tensors on $V$, and if

$$f(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_k}) = g(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_k})$$

for every $k$-tuple $I = (i_1, \ldots, i_k)$ of integers from the set $\{1, \ldots, n\}$,
then $f = g$.

Note that there is no requirement here that the integers $i_1, \ldots, i_k$ be
distinct or arranged in any particular order.

Proof. Given an arbitrary $k$-tuple $(\mathbf{v}_1, \ldots, \mathbf{v}_k)$ of vectors of $V$, let us
express each $\mathbf{v}_i$ in terms of the given basis, writing

$$\mathbf{v}_i = \sum_{j=1}^{n} c_{ij} \mathbf{a}_j.$$

Then we compute

$$f(\mathbf{v}_1, \ldots, \mathbf{v}_k) = \sum_{j_1=1}^{n} c_{1 j_1} f(\mathbf{a}_{j_1}, \mathbf{v}_2, \ldots, \mathbf{v}_k)$$

$$= \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} c_{1 j_1} c_{2 j_2} f(\mathbf{a}_{j_1}, \mathbf{a}_{j_2}, \mathbf{v}_3, \ldots, \mathbf{v}_k),$$

and so on. Eventually we obtain the equation

$$f(\mathbf{v}_1, \ldots, \mathbf{v}_k) = \sum_{1 \leq j_1, \ldots, j_k \leq n} c_{1 j_1} c_{2 j_2} \cdots c_{k j_k} f(\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}).$$

The same computation holds for $g$. It follows that $f$ and $g$ agree on all
$k$-tuples of vectors if they agree on all $k$-tuples of basis elements. □

Just as a linear transformation from $V$ to $W$ can be defined by specifying
its values arbitrarily on basis elements for $V$, a $k$-tensor on $V$ can be defined
by specifying its values arbitrarily on $k$-tuples of basis elements. That fact is
a consequence of the next theorem.


Theorem 26.3. Let $V$ be a vector space with basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$. Let
$I = (i_1, \ldots, i_k)$ be a $k$-tuple of integers from the set $\{1, \ldots, n\}$. There is
a unique $k$-tensor $\phi_I$ on $V$ such that, for every $k$-tuple $J = (j_1, \ldots, j_k)$
from the set $\{1, \ldots, n\}$,

$$(*) \qquad \phi_I(\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}) = \begin{cases} 0 & \text{if } I \neq J, \\ 1 & \text{if } I = J. \end{cases}$$

The tensors $\phi_I$ form a basis for $\mathcal{L}^k(V)$.

The tensors $\phi_I$ are called the elementary $k$-tensors on $V$ corresponding
to the basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$ for $V$. Since they form a basis for $\mathcal{L}^k(V)$ and since
there are $n^k$ distinct $k$-tuples from the set $\{1, \ldots, n\}$, the space $\mathcal{L}^k(V)$ must
have dimension $n^k$. When $k = 1$, the basis for $V^*$ formed by the elementary
tensors $\phi_1, \ldots, \phi_n$ is called the basis for $V^*$ dual to the given basis for $V$.

Proof. Uniqueness follows from the preceding lemma. We prove existence
as follows: First, consider the case $k = 1$. We know that we can determine
a linear transformation $\phi_i : V \to \mathbf{R}$ by specifying its values arbitrarily
on basis elements. So we can define $\phi_i$ by the equation

$$\phi_i(\mathbf{a}_j) = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j. \end{cases}$$

These then are the desired 1-tensors. In the case $k > 1$, we define $\phi_I$ by the
equation

$$\phi_I(\mathbf{v}_1, \ldots, \mathbf{v}_k) = [\phi_{i_1}(\mathbf{v}_1)] \cdot [\phi_{i_2}(\mathbf{v}_2)] \cdots [\phi_{i_k}(\mathbf{v}_k)].$$

It follows, from the facts that (1) each $\phi_i$ is linear and (2) multiplication is
distributive, that $\phi_I$ is multilinear. One checks readily that it has the required
value on $(\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k})$.

We show that the tensors $\phi_I$ form a basis for $\mathcal{L}^k(V)$. Given a $k$-tensor $f$
on $V$, we show that it can be written uniquely as a linear combination of the
tensors $\phi_I$. For each $k$-tuple $I = (i_1, \ldots, i_k)$, let $d_I$ be the scalar defined by
the equation

$$d_I = f(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_k}).$$

Then consider the $k$-tensor

$$g = \sum_J d_J \phi_J,$$

where the summation extends over all $k$-tuples $J$ of integers from the set
$\{1, \ldots, n\}$. The value of $g$ on the $k$-tuple $(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_k})$ equals $d_I$, by (*),
and the value of $f$ on this $k$-tuple equals the same thing by definition. Then
the preceding lemma implies that $f = g$. Uniqueness of this representation
of $f$ follows from the preceding lemma. □

It follows from this theorem that given scalars $d_I$ for all $I$, there is exactly
one $k$-tensor $f$ such that $f(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_k}) = d_I$ for all $I$. Thus a $k$-tensor may
be defined by specifying its values arbitrarily on $k$-tuples of basis elements.
EXAMPLE 1. Consider the case $V = \mathbf{R}^n$. Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be the usual basis
for $\mathbf{R}^n$; let $\phi_1, \ldots, \phi_n$ be the dual basis for $\mathcal{L}^1(V)$. Then if $\mathbf{x}$ has components
$x_1, \ldots, x_n$, we have

$$\phi_i(\mathbf{x}) = \phi_i(x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n) = x_i.$$

Thus $\phi_i : \mathbf{R}^n \to \mathbf{R}$ equals projection onto the $i$th coordinate.

More generally, given $I = (i_1, \ldots, i_k)$, the elementary tensor $\phi_I$ satisfies
the equation

$$\phi_I(\mathbf{x}_1, \ldots, \mathbf{x}_k) = \phi_{i_1}(\mathbf{x}_1) \cdots \phi_{i_k}(\mathbf{x}_k).$$

Let us write $X = [\mathbf{x}_1 \cdots \mathbf{x}_k]$, and let $x_{ij}$ denote the entry of $X$ in row $i$ and
column $j$. Then $\mathbf{x}_j$ is the vector having components $x_{1j}, \ldots, x_{nj}$. In this
notation,

$$\phi_I(\mathbf{x}_1, \ldots, \mathbf{x}_k) = x_{i_1 1} x_{i_2 2} \cdots x_{i_k k}.$$

Thus $\phi_I$ is just a monomial in the components of the vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$; and
the general $k$-tensor on $\mathbf{R}^n$ is a linear combination of such monomials.

It follows that the general 1-tensor on $\mathbf{R}^n$ is a function of the form

$$f(\mathbf{x}) = d_1 x_1 + \cdots + d_n x_n,$$

for some scalars $d_i$. The general 2-tensor on $\mathbf{R}^n$ has the form

$$g(\mathbf{x}, \mathbf{y}) = \sum_{i,j=1}^{n} d_{ij} x_i y_j,$$

for some scalars $d_{ij}$. And so on.
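
The description of tensors on $\mathbf{R}^n$ as linear combinations of monomials can be illustrated computationally. The sketch below is only an illustration under stated assumptions: vectors are Python tuples, indices are 0-based (so the mathematical index $i$ corresponds to position `i - 1`), and the names `elementary_tensor`, `coefficients`, and `from_coefficients` are hypothetical. It computes the scalars $d_I$ by evaluating a tensor on basis vectors and then rebuilds the tensor as $\sum_I d_I \phi_I$.

```python
from itertools import product

def elementary_tensor(I):
    """Return phi_I on R^n for a 0-based index tuple I: the product of the x_j[i_j]."""
    def phi(*vectors):
        assert len(vectors) == len(I)
        value = 1
        for i, x in zip(I, vectors):
            value *= x[i]
        return value
    return phi

def coefficients(f, n, k):
    """Return the dictionary I -> d_I = f(e_{i_1}, ..., e_{i_k}) over all k-tuples I."""
    basis = [tuple(1 if r == c else 0 for r in range(n)) for c in range(n)]
    return {I: f(*(basis[i] for i in I)) for I in product(range(n), repeat=k)}

def from_coefficients(d):
    """Rebuild the tensor g = sum over I of d_I * phi_I."""
    def g(*vectors):
        return sum(c * elementary_tensor(I)(*vectors) for I, c in d.items())
    return g

# A sample 2-tensor on R^3 (assumed for illustration):
# f(x, y) = 2 x_1 y_2 - 5 x_3 y_3 in the 1-based notation of the text.
def f(x, y):
    return 2 * x[0] * y[1] - 5 * x[2] * y[2]

if __name__ == "__main__":
    d = coefficients(f, n=3, k=2)
    g = from_coefficients(d)
    x, y = (1, 2, 3), (4, 5, 6)
    print(len(d))            # number of 2-tuples from a 3-element set
    print(f(x, y), g(x, y))  # the two values should coincide
```

The dictionary `d` has $n^k$ entries, one for each $k$-tuple, which matches the dimension count for $\mathcal{L}^k(V)$.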


The tensor product 

Now we introduce a product operation into the set of all tensors on V. 
The product of a $k$-tensor and an $\ell$-tensor will be a $k + \ell$ tensor.

Definition. Let $f$ be a $k$-tensor on $V$ and let $g$ be an $\ell$-tensor on $V$.
We define a $k + \ell$ tensor $f \otimes g$ on $V$ by the equation

$$(f \otimes g)(\mathbf{v}_1, \ldots, \mathbf{v}_{k+\ell}) = f(\mathbf{v}_1, \ldots, \mathbf{v}_k) \cdot g(\mathbf{v}_{k+1}, \ldots, \mathbf{v}_{k+\ell}).$$

It is easy to check that the function $f \otimes g$ is multilinear; it is called the tensor
product of $f$ and $g$.

We list some of the properties of this product operation: 
Theorem 26.4. Let $f, g, h$ be tensors on $V$. Then the following
properties hold:

(1) (Associativity). $f \otimes (g \otimes h) = (f \otimes g) \otimes h$.

(2) (Homogeneity). $(cf) \otimes g = c(f \otimes g) = f \otimes (cg)$.

(3) (Distributivity). Suppose $f$ and $g$ have the same order. Then:

$$(f + g) \otimes h = f \otimes h + g \otimes h,$$
$$h \otimes (f + g) = h \otimes f + h \otimes g.$$

(4) Given a basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$ for $V$, the corresponding elementary
tensors $\phi_I$ satisfy the equation

$$\phi_I = \phi_{i_1} \otimes \phi_{i_2} \otimes \cdots \otimes \phi_{i_k},$$

where $I = (i_1, \ldots, i_k)$.

Note that no parentheses are needed in the expression for $\phi_I$ given in (4),
since $\otimes$ is associative. Note also that nothing is said here about commutativity.
The reason is obvious; it almost never holds.

Proof. The proofs are straightforward. Associativity is proved, for instance,
by noting that (if $f, g, h$ have orders $k, \ell, m$, respectively)

$$(f \otimes (g \otimes h))(\mathbf{v}_1, \ldots, \mathbf{v}_{k+\ell+m})$$
$$= f(\mathbf{v}_1, \ldots, \mathbf{v}_k) \cdot g(\mathbf{v}_{k+1}, \ldots, \mathbf{v}_{k+\ell}) \cdot h(\mathbf{v}_{k+\ell+1}, \ldots, \mathbf{v}_{k+\ell+m}).$$

The value of $(f \otimes g) \otimes h$ on the given tuple is the same. □
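
A small computation makes the failure of commutativity concrete. This is a sketch under assumptions, not the text's own notation: tensors are Python functions of tuple-valued vectors, the order of each factor is passed explicitly, and `tensor_product` is a hypothetical helper name. Here `phi1` and `phi2` stand for the elementary 1-tensors $\phi_1, \phi_2$ on $\mathbf{R}^2$.

```python
def tensor_product(f, k, g, l):
    """Return the (k + l)-tensor taking (v_1, ..., v_{k+l}) to f(v_1..v_k) * g(v_{k+1}..v_{k+l})."""
    def fg(*vectors):
        assert len(vectors) == k + l
        return f(*vectors[:k]) * g(*vectors[k:])
    return fg

# Elementary 1-tensors on R^2: projection onto the first and second coordinates.
def phi1(x):
    return x[0]

def phi2(x):
    return x[1]

p12 = tensor_product(phi1, 1, phi2, 1)   # phi_1 (x) phi_2
p21 = tensor_product(phi2, 1, phi1, 1)   # phi_2 (x) phi_1

# Both ways of bracketing a triple product, as in property (1).
left = tensor_product(tensor_product(phi1, 1, phi2, 1), 2, phi1, 1)
right = tensor_product(phi1, 1, tensor_product(phi2, 1, phi1, 1), 2)

if __name__ == "__main__":
    e1, e2 = (1, 0), (0, 1)
    print(p12(e1, e2), p21(e1, e2))   # the two products differ on (e1, e2)
    u, v, w = (2, 3), (5, 7), (11, 13)
    print(left(u, v, w), right(u, v, w))   # the two bracketings agree
```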

The action of a linear transformation 

Finally, we examine how tensors behave with respect to linear transfor- 
mation of the underlying vector spaces. 

Definition. Let $T : V \to W$ be a linear transformation. We define the
dual transformation

$$T^* : \mathcal{L}^k(W) \to \mathcal{L}^k(V),$$

(which goes in the opposite direction) as follows: If $f$ is in $\mathcal{L}^k(W)$, and if
$\mathbf{v}_1, \ldots, \mathbf{v}_k$ are vectors in $V$, then

$$(T^* f)(\mathbf{v}_1, \ldots, \mathbf{v}_k) = f(T(\mathbf{v}_1), \ldots, T(\mathbf{v}_k)).$$

The transformation $T^* f$ is the composite of the transformation $T \times \cdots \times T$
and the transformation $f$, as indicated in the following diagram:

$$V^k \xrightarrow{\ T \times \cdots \times T\ } W^k \xrightarrow{\ f\ } \mathbf{R}.$$

It is immediate from the definition that $T^* f$ is multilinear, since $T$ is linear
and $f$ is multilinear. It is also true that $T^*$ itself is linear, as a map of
tensors, as we now show.

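Theorem 26.5. Let $T : V \to W$ be a linear transformation; let

$$T^* : \mathcal{L}^k(W) \to \mathcal{L}^k(V)$$

be the dual transformation. Then:

(1) $T^*$ is linear.

(2) $T^*(f \otimes g) = T^* f \otimes T^* g$.

(3) If $S : W \to X$ is a linear transformation, then $(S \circ T)^* f = T^*(S^* f)$.

Proof. The proofs are straightforward. One verifies (1), for instance, as
follows:

$$(T^*(af + bg))(\mathbf{v}_1, \ldots, \mathbf{v}_k) = (af + bg)(T(\mathbf{v}_1), \ldots, T(\mathbf{v}_k))$$

$$= a f(T(\mathbf{v}_1), \ldots, T(\mathbf{v}_k)) + b g(T(\mathbf{v}_1), \ldots, T(\mathbf{v}_k))$$

$$= a T^* f(\mathbf{v}_1, \ldots, \mathbf{v}_k) + b T^* g(\mathbf{v}_1, \ldots, \mathbf{v}_k),$$

whence $T^*(af + bg) = a T^* f + b T^* g$. □

The following diagrams illustrate property (3):

$$V \xrightarrow{\ T\ } W \xrightarrow{\ S\ } X \quad (\text{composite } S \circ T), \qquad \mathcal{L}^k(X) \xrightarrow{\ S^*\ } \mathcal{L}^k(W) \xrightarrow{\ T^*\ } \mathcal{L}^k(V) \quad (\text{composite } (S \circ T)^*).$$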
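
Property (3) can be spot-checked on a concrete example. The following sketch assumes that linear transformations between euclidean spaces are given by matrices stored as lists of rows, and that tensors are Python functions of tuples; the helper names `apply`, `matmul`, and `pullback` and the particular matrices and tensor are illustrative choices, not taken from the text.

```python
def apply(matrix, v):
    """Return the matrix-vector product as a tuple."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in matrix)

def matmul(S, T):
    """Return the matrix product S * T, the matrix of the composite S o T."""
    cols = list(zip(*T))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in S]

def pullback(matrix, f):
    """Return T* f, where T(v) = matrix * v: evaluate f on the images T(v_1), ..., T(v_k)."""
    def pulled(*vectors):
        return f(*(apply(matrix, v) for v in vectors))
    return pulled

T = [[1, 2], [0, 1], [3, -1]]    # T : R^2 -> R^3
S = [[1, 0, 2], [-1, 1, 0]]      # S : R^3 -> R^2

def f(x, y):
    """A sample 2-tensor on R^2."""
    return x[0] * y[1] - 3 * x[1] * y[1]

lhs = pullback(matmul(S, T), f)       # (S o T)* f
rhs = pullback(T, pullback(S, f))     # T*(S* f)

if __name__ == "__main__":
    u, v = (1, 2), (3, -1)
    print(lhs(u, v), rhs(u, v))   # both sides should give the same value
```

Agreement at a few sample points is of course not a proof; it only illustrates that the dual transformations compose in the reverse order.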
EXERCISES

1. (a) Show that if $f, g : V^k \to \mathbf{R}$ are multilinear, so is $af + bg$.

(b) Check that $\mathcal{L}^k(V)$ satisfies the axioms of a vector space.

2. (a) Show that if $f$ and $g$ are multilinear, so is $f \otimes g$.

(b) Check the basic properties of the tensor product (Theorem 26.4).

3. Verify (2) and (3) of Theorem 26.5.

4. Determine which of the following are tensors on $\mathbf{R}^4$, and express those
that are in terms of the elementary tensors on $\mathbf{R}^4$:

$$f(\mathbf{x}, \mathbf{y}) = 3 x_1 y_2 + 5 x_2 x_3,$$
$$g(\mathbf{x}, \mathbf{y}) = x_1 y_2 + x_2 y_1 + 1,$$
$$h(\mathbf{x}, \mathbf{y}) = x_1 y_1 - 7 x_2 y_3.$$

5. Repeat Exercise 4 for the functions

$$f(\mathbf{x}, \mathbf{y}, \mathbf{z}) = 3 x_1 x_2 z_3 - x_3 y_1 z_1,$$
$$g(\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{u}, \mathbf{v}) = 5 x_3 y_2 z_3 u_4 v_4,$$
$$h(\mathbf{x}, \mathbf{y}, \mathbf{z}) = x_1 y_2 z_4 + 2 x_1 z_3.$$

6. Let $f$ and $g$ be the following tensors on $\mathbf{R}^4$:

$$f(\mathbf{x}, \mathbf{y}, \mathbf{z}) = 2 x_1 y_2 z_2 - x_2 y_3 z_1,$$
$$g = \phi_{2,1} - 5 \phi_{3,1}.$$

(a) Express $f \otimes g$ as a linear combination of elementary 5-tensors.

(b) Express $(f \otimes g)(\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{u}, \mathbf{v})$ as a function.

7. Show that the four properties stated in Theorem 26.4 characterize the
tensor product uniquely, for finite-dimensional spaces $V$.

8. Let $f$ be a 1-tensor on $\mathbf{R}^n$; then $f(\mathbf{y}) = A \cdot \mathbf{y}$ for some matrix $A$ of size 1
by $n$. If $T : \mathbf{R}^m \to \mathbf{R}^n$ is the linear transformation $T(\mathbf{x}) = B \cdot \mathbf{x}$, what is
the matrix of the 1-tensor $T^* f$ on $\mathbf{R}^m$?


§27. ALTERNATING TENSORS 


In this section we introduce the particular kind of tensors with which we shall 
be concerned — the alternating tensors — and derive some of their properties. 
In order to do this, we need some basic facts about permutations. 
Permutations

Definition. Let $k \geq 2$. A permutation of the set of integers $\{1, \ldots, k\}$
is a one-to-one function $\sigma$ mapping this set onto itself. We denote the set of
all such permutations by $S_k$. If $\sigma$ and $\tau$ are elements of $S_k$, so are $\sigma \circ \tau$ and
$\sigma^{-1}$. The set $S_k$ thus forms a group, called the symmetric group (or the
permutation group) on the set $\{1, \ldots, k\}$. There are $k!$ elements in this
group.

Definition. Given $1 \leq i < k$, let $e_i$ be the element of $S_k$ defined by
setting $e_i(j) = j$ for $j \neq i, i + 1$; and

$$e_i(i) = i + 1 \quad \text{and} \quad e_i(i + 1) = i.$$

We call $e_i$ an elementary permutation. Note that $e_i \circ e_i$ equals the identity
permutation, so that $e_i$ is its own inverse.

Lemma 27.1. If $\sigma \in S_k$, then $\sigma$ equals a composite of elementary
permutations.

Proof. Given $0 \leq i \leq k$, we say that $\sigma$ fixes the first $i$ integers if
$\sigma(j) = j$ for $1 \leq j \leq i$. If $i = 0$, then $\sigma$ need not fix any integers at all.
If $i = k$, then $\sigma$ fixes all the integers $1, \ldots, k$, so that $\sigma$ is the identity
permutation. In this case the theorem holds, since the identity permutation
equals $e_j \circ e_j$ for any $j$.

We show that if $\sigma$ fixes the first $i - 1$ integers (where $0 < i \leq k$), then $\sigma$ can
be written as the composite $\sigma = \pi \circ \sigma'$, where $\pi$ is a composite of elementary
permutations and $\sigma'$ fixes the first $i$ integers. The theorem then follows by
induction.

The proof is easy. Since $\sigma$ fixes the integers $1, \ldots, i - 1$, and since $\sigma$ is
one-to-one, the value of $\sigma$ on $i$ must be a number different from $1, \ldots, i - 1$.
If $\sigma(i) = i$, then we set $\sigma' = \sigma$ and $\pi$ equal to the identity permutation, and
our result holds. If $\sigma(i) = \ell > i$, we set

$$\sigma' = e_i \circ \cdots \circ e_{\ell - 1} \circ \sigma.$$

Then $\sigma'$ fixes the integers $1, \ldots, i - 1$ because $\sigma$ fixes these integers and so
do $e_i, \ldots, e_{\ell - 1}$. And $\sigma'$ also fixes $i$, since $\sigma(i) = \ell$ and

$$e_i(\cdots(e_{\ell - 1}(\ell))\cdots) = i.$$

We can rewrite the equation defining $\sigma'$ in the form

$$e_{\ell - 1} \circ \cdots \circ e_i \circ \sigma' = \sigma;$$

thus our result holds. □
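
The induction in this proof is effectively an algorithm, and it can be sketched in code. The representation below is an assumption made for illustration: a permutation $\sigma$ of $\{1, \ldots, k\}$ is stored as a tuple whose entry in position $j - 1$ is $\sigma(j)$, and the function names `elementary`, `compose`, `decompose`, and `rebuild` are hypothetical. The function `decompose` returns indices $i_1, \ldots, i_m$ with $\sigma = e_{i_1} \circ \cdots \circ e_{i_m}$, following the steps of the proof.

```python
def elementary(i, k):
    """Return e_i in S_k: it exchanges i and i + 1 and fixes everything else."""
    p = list(range(1, k + 1))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def compose(s, t):
    """Return the composite s o t, that is, j -> s(t(j))."""
    return tuple(s[t[j] - 1] for j in range(len(t)))

def decompose(sigma):
    """Return indices [i_1, ..., i_m] such that sigma = e_{i_1} o ... o e_{i_m}."""
    k = len(sigma)
    current = tuple(sigma)           # plays the role of sigma, then sigma', and so on
    factors = []
    for i in range(1, k + 1):
        l = current[i - 1]           # current fixes 1, ..., i - 1, so l >= i
        # Form sigma' = e_i o ... o e_{l-1} o sigma by applying e_{l-1} first.
        for m in range(l - 1, i - 1, -1):
            current = compose(elementary(m, k), current)
        # Record pi = e_{l-1} o ... o e_i, so that sigma = pi o sigma'.
        factors.extend(range(l - 1, i - 1, -1))
    return factors

def rebuild(factors, k):
    """Return the composite e_{i_1} o ... o e_{i_m} for the listed indices."""
    result = tuple(range(1, k + 1))
    for i in factors:
        result = compose(result, elementary(i, k))
    return result

if __name__ == "__main__":
    sigma = (3, 1, 4, 2)                   # sigma(1) = 3, sigma(2) = 1, and so on
    factors = decompose(sigma)
    print(factors)
    print(rebuild(factors, 4) == sigma)    # the composite recovers sigma
```

The decomposition produced this way is not unique, and no claim is made that it uses the fewest possible elementary permutations.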
Definition. Let $\sigma \in S_k$. Consider the set of all pairs of integers $i, j$
from the set $\{1, \ldots, k\}$ for which $i < j$ and $\sigma(i) > \sigma(j)$. Each such pair is
called an inversion in $\sigma$. We define the sign of $\sigma$ to be the number $-1$ if the
number of inversions in $\sigma$ is odd, and to be the number $+1$ if the number of
inversions in $\sigma$ is even. We call $\sigma$ an odd or an even permutation according
as the sign of $\sigma$ equals $-1$ or $+1$, respectively. Denote the sign of $\sigma$ by $\text{sgn } \sigma$.

Lemma 27.2. Let $\sigma, \tau \in S_k$.

(a) If $\sigma$ equals a composite of $m$ elementary permutations, then
$\text{sgn } \sigma = (-1)^m$.

(b) $\text{sgn}(\sigma \circ \tau) = (\text{sgn } \sigma) \cdot (\text{sgn } \tau)$.

(c) $\text{sgn } \sigma^{-1} = \text{sgn } \sigma$.

(d) If $p \neq q$, and if $\tau$ is the permutation that exchanges $p$ and $q$ and
leaves all other integers fixed, then $\text{sgn } \tau = -1$.
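
Before turning to the proof, one can test statements (b), (c), and (d) by brute force for a small value of $k$. The sketch below is illustrative only and assumes the same tuple representation of permutations as elsewhere (position $j - 1$ holds $\sigma(j)$); the names `inversions`, `sgn`, `compose`, `inverse`, `transposition`, and `check_lemma` are hypothetical.

```python
from itertools import permutations

def inversions(sigma):
    """Count the pairs i < j with sigma(i) > sigma(j)."""
    k = len(sigma)
    return sum(1 for i in range(k) for j in range(i + 1, k) if sigma[i] > sigma[j])

def sgn(sigma):
    """Return -1 if the number of inversions is odd and +1 if it is even."""
    return -1 if inversions(sigma) % 2 else 1

def compose(s, t):
    """Return the composite s o t, that is, j -> s(t(j))."""
    return tuple(s[t[j] - 1] for j in range(len(t)))

def inverse(s):
    """Return the inverse permutation."""
    inv = [0] * len(s)
    for j, value in enumerate(s, start=1):
        inv[value - 1] = j
    return tuple(inv)

def transposition(p, q, k):
    """Return the permutation of {1, ..., k} exchanging p and q and fixing the rest."""
    t = list(range(1, k + 1))
    t[p - 1], t[q - 1] = q, p
    return tuple(t)

def check_lemma(k):
    """Check (b), (c), (d) of the lemma for every element of S_k by exhaustion."""
    group = list(permutations(range(1, k + 1)))
    b = all(sgn(compose(s, t)) == sgn(s) * sgn(t) for s in group for t in group)
    c = all(sgn(inverse(s)) == sgn(s) for s in group)
    d = all(sgn(transposition(p, q, k)) == -1
            for p in range(1, k + 1) for q in range(1, k + 1) if p != q)
    return b and c and d

if __name__ == "__main__":
    print(inversions((3, 1, 4, 2)), sgn((3, 1, 4, 2)))
    print(check_lemma(4))
```

An exhaustive check for one value of $k$ is evidence, not a proof; the argument that follows handles all $k$ at once.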

Proof. Step 1. We show that for any $\sigma$,

$$\operatorname{sgn}(\sigma \circ e_\ell) = -\operatorname{sgn} \sigma.$$

Given $\sigma$, let us write down the values of $\sigma$ in order as follows:

$$(*) \qquad (\sigma(1), \sigma(2), \ldots, \sigma(\ell), \sigma(\ell+1), \ldots, \sigma(k)).$$

Let $\tau = \sigma \circ e_\ell$; then the corresponding sequence for $\tau$ is the $k$-tuple of numbers

$$(**) \qquad (\tau(1), \tau(2), \ldots, \tau(\ell), \tau(\ell+1), \ldots, \tau(k))
= (\sigma(1), \sigma(2), \ldots, \sigma(\ell+1), \sigma(\ell), \ldots, \sigma(k)).$$

The number of inversions in $\sigma$ and $\tau$, respectively, are the number of pairs of
integers that appear in the sequences (*) and (**), respectively, in the reverse
of their natural order. We compare inversions in these two sequences. Let
$p \ne q$; we compare the positions of $\sigma(p)$ and $\sigma(q)$ in these two sequences.
If neither $p$ nor $q$ equals $\ell$ or $\ell + 1$, then $\sigma(p)$ and $\sigma(q)$ appear in the same
slots in both sequences, so they constitute an inversion in one sequence if and
only if they constitute an inversion in the other. Now consider the case where
one, say $p$, equals either $\ell$ or $\ell + 1$, and the other $q$ is different from both $\ell$
and $\ell + 1$. Then $\sigma(q)$ appears in the same slot in both sequences, but $\sigma(p)$
appears in the two sequences in adjacent slots. Nevertheless, it is still true
that $\sigma(p)$ and $\sigma(q)$ constitute an inversion in one sequence if and only if they
constitute an inversion in the other.

So far the number of inversions in the two sequences are the same. But
now we note that if $\sigma(\ell)$ and $\sigma(\ell + 1)$ form an inversion in the first sequence,
they do not form an inversion in the second; and conversely. Hence
sequence (**) has either one more inversion, or one fewer inversion, than (*).

Step 2. We prove the lemma. The identity permutation has sign $+1$;
and composing it successively with $m$ elementary permutations changes its
sign $m$ times, by Step 1. Thus (a) holds. To prove (b), we write $\sigma$ as
the composite of $m$ elementary permutations, and $\tau$ as the composite of $n$
elementary permutations. Then $\sigma \circ \tau$ is the composite of $m + n$ elementary
permutations; and (b) follows from the equation $(-1)^{m+n} = (-1)^m (-1)^n$.
To check (c), we note that since $\sigma^{-1} \circ \sigma$ equals the identity permutation,
$(\operatorname{sgn} \sigma^{-1})(\operatorname{sgn} \sigma) = 1$.

To prove (d), one simply counts inversions. Suppose that $p < q$. We can
write the values of $\tau$ in order as

$$(1, \ldots, p-1, [q], p+1, \ldots, p+\ell-1, [p], p+\ell+1, \ldots, k),$$

where $q = p + \ell$. Each of the pairs $\{q, p+1\}, \ldots, \{q, p+\ell-1\}$ constitutes an
inversion in this sequence, and so does each of the pairs $\{p+1, p\}, \ldots, \{p+\ell-1, p\}$.
Finally, $\{q, p\}$ is an inversion as well. Thus $\tau$ has $2\ell - 1$ inversions,
so it is odd. □
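Parts (b), (c), and (d) of the lemma can be spot-checked by brute force on a small symmetric group. The sketch below is only a sanity check, not a substitute for the proof; permutations are assumed to be stored as tuples of values, and all function names are illustrative.

```python
from itertools import permutations


def count_inversions(sigma):
    """Number of pairs i < j with sigma(i) > sigma(j)."""
    k = len(sigma)
    return sum(1 for i in range(k) for j in range(i + 1, k)
               if sigma[i] > sigma[j])


def sgn(sigma):
    return -1 if count_inversions(sigma) % 2 else 1


def compose(s, t):
    """(s o t)(j) = s(t(j)), with 1-indexed values."""
    return tuple(s[t[j] - 1] for j in range(len(t)))


def inverse(s):
    inv = [0] * len(s)
    for j, value in enumerate(s, start=1):
        inv[value - 1] = j
    return tuple(inv)


def exchange(p, q, k):
    """The permutation in S_k that exchanges p and q and fixes the rest."""
    values = list(range(1, k + 1))
    values[p - 1], values[q - 1] = q, p
    return tuple(values)


def check_lemma(k):
    """Return the truth of (b), (c), (d) over all of S_k."""
    perms = list(permutations(range(1, k + 1)))
    b = all(sgn(compose(s, t)) == sgn(s) * sgn(t) for s in perms for t in perms)
    c = all(sgn(inverse(s)) == sgn(s) for s in perms)
    d = all(sgn(exchange(p, q, k)) == -1
            for p in range(1, k + 1) for q in range(p + 1, k + 1))
    return b, c, d
```

The inversion count in the proof of (d) can be seen directly: exchanging 2 and 5 in $S_6$ gives the sequence (1, 5, 3, 4, 2, 6), which has $2 \cdot 3 - 1 = 5$ inversions.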

Alternating tensors 

Definition. Let $f$ be an arbitrary $k$-tensor on $V$. If $\sigma$ is a permutation
of $\{1, \ldots, k\}$, we define $f^\sigma$ by the equation

$$f^\sigma(v_1, \ldots, v_k) = f(v_{\sigma(1)}, \ldots, v_{\sigma(k)}).$$

Because $f$ is linear in each of its variables, so is $f^\sigma$; thus $f^\sigma$ is a $k$-tensor
on $V$. The tensor $f$ is said to be symmetric if $f^e = f$ for each elementary
permutation $e$, and it is said to be alternating if $f^e = -f$ for every
elementary permutation $e$.

Said differently, $f$ is symmetric if

$$f(v_1, \ldots, v_{i+1}, v_i, \ldots, v_k) = f(v_1, \ldots, v_i, v_{i+1}, \ldots, v_k)$$

for all $i$; and $f$ is alternating if

$$f(v_1, \ldots, v_{i+1}, v_i, \ldots, v_k) = -f(v_1, \ldots, v_i, v_{i+1}, \ldots, v_k).$$

While symmetric tensors are important in mathematics, we shall not be
concerned with them here. We shall be primarily interested in alternating tensors.

Definition. If $V$ is a vector space, we denote the set of alternating
$k$-tensors on $V$ by $\mathcal{A}^k(V)$. It is easy to check that the sum of two alternating
tensors is alternating, and that so is a scalar multiple of an alternating tensor.
Then $\mathcal{A}^k(V)$ is a linear subspace of the space $\mathcal{L}^k(V)$ of all $k$-tensors on $V$.
The condition that a 1-tensor be alternating is vacuous. Therefore we make
the convention that $\mathcal{A}^1(V) = \mathcal{L}^1(V)$.

EXAMPLE 1. The elementary tensors of order $k > 1$ are not alternating, but
certain linear combinations of them are alternating. For instance, the tensor

$$f = \phi_{i,j} - \phi_{j,i}$$

is alternating, as you can check. Indeed, if $V = \mathbf{R}^n$ and we use the usual basis
for $\mathbf{R}^n$ and corresponding dual basis $\phi_i$, the function $f$ satisfies the equation

$$f(x, y) = x_i y_j - x_j y_i = \det \begin{bmatrix} x_i & y_i \\ x_j & y_j \end{bmatrix}.$$

Here it is obvious that $f(y, x) = -f(x, y)$. Similarly, the function

$$g(x, y, z) = \det \begin{bmatrix} x_i & y_i & z_i \\ x_j & y_j & z_j \\ x_k & y_k & z_k \end{bmatrix}$$

is an alternating 3-tensor on $\mathbf{R}^n$; one can also write $g$ in the form

$$g = \phi_{i,j,k} + \phi_{j,k,i} + \phi_{k,i,j} - \phi_{j,i,k} - \phi_{i,k,j} - \phi_{k,j,i}.$$

This example suggests that alternating tensors and the determinant
function are intimately related. This is in fact the case, as we shall see.

We now study the space $\mathcal{A}^k(V)$; in particular, we find a basis for it. Let
us begin with a lemma:

Lemma 27.3. Let $f$ be a $k$-tensor on $V$; let $\sigma, \tau \in S_k$.

(a) The transformation $f \to f^\sigma$ is a linear transformation of $\mathcal{L}^k(V)$
to $\mathcal{L}^k(V)$. It has the property that for all $\sigma, \tau$,

$$(f^\sigma)^\tau = f^{\tau \circ \sigma}.$$

(b) The tensor $f$ is alternating if and only if $f^\sigma = (\operatorname{sgn} \sigma) f$ for all $\sigma$.
If $f$ is alternating and if $v_p = v_q$ with $p \ne q$, then $f(v_1, \ldots, v_k) = 0$.

Proof. (a) The linearity property is straightforward; it states simply
that $(af + bg)^\sigma = a f^\sigma + b g^\sigma$. To complete the proof of (a), we compute

$$\begin{aligned}
(f^\sigma)^\tau(v_1, \ldots, v_k) &= f^\sigma(v_{\tau(1)}, \ldots, v_{\tau(k)}) \\
&= f^\sigma(w_1, \ldots, w_k), \quad \text{where } w_i = v_{\tau(i)}, \\
&= f(w_{\sigma(1)}, \ldots, w_{\sigma(k)}) \\
&= f(v_{\tau(\sigma(1))}, \ldots, v_{\tau(\sigma(k))}) \\
&= f^{\tau \circ \sigma}(v_1, \ldots, v_k).
\end{aligned}$$

(b) Given an arbitrary permutation $\sigma$, let us write it as the composite

$$\sigma = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_m,$$

where each $\sigma_i$ is an elementary permutation. Then

$$\begin{aligned}
f^\sigma &= f^{\sigma_1 \circ \cdots \circ \sigma_m} \\
&= ((\cdots(f^{\sigma_m})\cdots)^{\sigma_2})^{\sigma_1} && \text{by (a),} \\
&= (-1)^m f && \text{because } f \text{ is alternating,} \\
&= (\operatorname{sgn} \sigma) f.
\end{aligned}$$

Now suppose $v_p = v_q$ for $p \ne q$. Let $\tau$ be the permutation that
exchanges $p$ and $q$. Since $v_p = v_q$,

$$f^\tau(v_1, \ldots, v_k) = f(v_1, \ldots, v_k).$$

On the other hand,

$$f^\tau(v_1, \ldots, v_k) = -f(v_1, \ldots, v_k),$$

since $\operatorname{sgn} \tau = -1$. It follows that $f(v_1, \ldots, v_k) = 0$. □
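The order of composition in part (a) is easy to get backwards, so a numerical check may be helpful. The sketch below assumes a $k$-tensor is modelled as a Python function of $k$ vectors and a permutation as a tuple of 1-indexed values; `permute`, `phi`, and `law_holds` are illustrative names, and `phi` is just one sample 3-tensor on $\mathbf{R}^3$.

```python
from itertools import permutations


def compose(s, t):
    """(s o t)(j) = s(t(j)), with 1-indexed values."""
    return tuple(s[t[j] - 1] for j in range(len(t)))


def permute(f, sigma):
    """Return f^sigma, where f^sigma(v_1..v_k) = f(v_sigma(1), ..., v_sigma(k))."""
    def f_sigma(*vectors):
        return f(*(vectors[m - 1] for m in sigma))
    return f_sigma


def phi(x, y, z):
    """Sample 3-tensor on R^3: the elementary tensor x_1 * y_2 * z_3."""
    return x[0] * y[1] * z[2]


def law_holds(f, vectors):
    """Check (f^sigma)^tau == f^(tau o sigma) at the given vectors, for all pairs."""
    k = len(vectors)
    perms = list(permutations(range(1, k + 1)))
    return all(
        permute(permute(f, sigma), tau)(*vectors)
        == permute(f, compose(tau, sigma))(*vectors)
        for sigma in perms for tau in perms
    )
```

Integer test vectors keep the comparison exact; agreement at one choice of vectors is of course only evidence, not a proof.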

We now obtain a basis for the space $\mathcal{A}^k(V)$. There is nothing to be done
in the case $k = 1$, since $\mathcal{A}^1(V) = \mathcal{L}^1(V)$. And in the case where $k > n$,
the space $\mathcal{A}^k(V)$ is trivial. For any $k$-tensor $f$ is uniquely determined by
its values on $k$-tuples of basis elements. If $k > n$, some basis element must
appear in the $k$-tuple more than once, whence if $f$ is alternating, the value
of $f$ on the $k$-tuple must be zero.

Finally, we consider the case $1 < k \le n$. We show first that an alternating
tensor $f$ is entirely determined by its values on $k$-tuples of basis elements
whose indices are in ascending order. Then we show that the value of $f$
on such $k$-tuples may be specified arbitrarily.

Lemma 27.4. Let $a_1, \ldots, a_n$ be a basis for $V$. If $f, g$ are alternating
$k$-tensors on $V$, and if

$$f(a_{i_1}, \ldots, a_{i_k}) = g(a_{i_1}, \ldots, a_{i_k})$$

for every ascending $k$-tuple of integers $I = (i_1, \ldots, i_k)$ from the set
$\{1, \ldots, n\}$, then $f = g$.

Proof. In view of Lemma 26.2, it suffices to prove that $f$ and $g$ have
the same values on an arbitrary $k$-tuple $(a_{j_1}, \ldots, a_{j_k})$ of basis elements. Let
$J = (j_1, \ldots, j_k)$.

If two of the indices, say $j_p$ and $j_q$, are the same, then the values of $f$
and $g$ on this tuple are zero, by the preceding lemma. If all the indices
are distinct, let $\sigma$ be the permutation of $\{1, \ldots, k\}$ such that the $k$-tuple
$I = (j_{\sigma(1)}, \ldots, j_{\sigma(k)})$ is ascending. Then

$$\begin{aligned}
f(a_{i_1}, \ldots, a_{i_k}) &= f^\sigma(a_{j_1}, \ldots, a_{j_k}) && \text{by definition of } f^\sigma, \\
&= (\operatorname{sgn} \sigma) f(a_{j_1}, \ldots, a_{j_k}) && \text{because } f \text{ is alternating.}
\end{aligned}$$

A similar equation holds for $g$. Since $f$ and $g$ agree on the $k$-tuple
$(a_{i_1}, \ldots, a_{i_k})$, they agree on the $k$-tuple $(a_{j_1}, \ldots, a_{j_k})$. □

Theorem 27.5. Let $V$ be a vector space with basis $a_1, \ldots, a_n$. Let
$I = (i_1, \ldots, i_k)$ be an ascending $k$-tuple from the set $\{1, \ldots, n\}$. There
is a unique alternating $k$-tensor $\psi_I$ on $V$ such that for every ascending
$k$-tuple $J = (j_1, \ldots, j_k)$ from the set $\{1, \ldots, n\}$,

$$\psi_I(a_{j_1}, \ldots, a_{j_k}) = \begin{cases} 0 & \text{if } I \ne J, \\ 1 & \text{if } I = J. \end{cases}$$

The tensors $\psi_I$ form a basis for $\mathcal{A}^k(V)$. The tensor $\psi_I$ in fact satisfies
the formula

$$\psi_I = \sum_\sigma (\operatorname{sgn} \sigma)(\phi_I)^\sigma,$$

where the summation extends over all $\sigma \in S_k$.


The tensors $\psi_I$ are called the elementary alternating $k$-tensors on $V$
corresponding to the basis $a_1, \ldots, a_n$ for $V$.

Proof. Uniqueness follows from the preceding lemma. To prove existence,
we define $\psi_I$ by the formula given in the theorem, and show that $\psi_I$
satisfies the requirements of the theorem.

First, we show $\psi_I$ is alternating. If $\tau \in S_k$, we compute

$$\begin{aligned}
(\psi_I)^\tau &= \sum_\sigma (\operatorname{sgn} \sigma)((\phi_I)^\sigma)^\tau && \text{by linearity,} \\
&= \sum_\sigma (\operatorname{sgn} \sigma)(\phi_I)^{\tau \circ \sigma} \\
&= (\operatorname{sgn} \tau) \sum_\sigma (\operatorname{sgn}(\tau \circ \sigma))(\phi_I)^{\tau \circ \sigma} \\
&= (\operatorname{sgn} \tau)\,\psi_I;
\end{aligned}$$

the last equation follows from the fact that $\tau \circ \sigma$ ranges over $S_k$ as $\sigma$ does.
We show $\psi_I$ has the desired values. Given $J$, we have

$$\psi_I(a_{j_1}, \ldots, a_{j_k}) = \sum_\sigma (\operatorname{sgn} \sigma)\,\phi_I(a_{j_{\sigma(1)}}, \ldots, a_{j_{\sigma(k)}}).$$

Now at most one term of this summation can be non-zero, namely the term
corresponding to the permutation $\sigma$ for which $I = (j_{\sigma(1)}, \ldots, j_{\sigma(k)})$. Since
both $I$ and $J$ are ascending, this occurs only if $I = J$ and $\sigma$ is the identity
permutation, in which case the value is 1. If $I \ne J$, then all terms vanish.

Now we show the $\psi_I$ form a basis for $\mathcal{A}^k(V)$. Let $f$ be an alternating
$k$-tensor on $V$. We show that $f$ can be written uniquely as a linear combination
of the tensors $\psi_I$.

Given $f$, for each ascending $k$-tuple $I = (i_1, \ldots, i_k)$ from the set
$\{1, \ldots, n\}$, let $d_I$ be the scalar

$$d_I = f(a_{i_1}, \ldots, a_{i_k}).$$

Then consider the alternating $k$-tensor

$$g = \sum_{[J]} d_J \psi_J,$$

where the notation $[J]$ indicates that the summation extends over all ascending
$k$-tuples from $\{1, \ldots, n\}$. If $I$ is an ascending $k$-tuple, then the value of $g$
on the $k$-tuple $(a_{i_1}, \ldots, a_{i_k})$ equals $d_I$; and the value of $f$ on this $k$-tuple is
the same. Hence $f = g$. Uniqueness of this representation of $f$ follows from
the preceding lemma. □


This theorem shows that once a basis $a_1, \ldots, a_n$ for $V$ has been chosen,
an arbitrary alternating $k$-tensor $f$ can be written uniquely in the form

$$f = \sum_{[J]} d_J \psi_J.$$

The numbers $d_J$ are called components of $f$ relative to the basis $\{\psi_J\}$.
What is the dimension of the vector space $\mathcal{A}^k(V)$? If $k = 1$, then $\mathcal{A}^1(V)$
has dimension $n$, of course. In general, given $k > 1$ and given any subset of
$\{1, \ldots, n\}$ having $k$ elements, there is exactly one corresponding ascending
$k$-tuple, and hence one corresponding elementary alternating $k$-tensor. Thus
the number of basis elements for $\mathcal{A}^k(V)$ equals the number of combinations
of $n$ objects, taken $k$ at a time. This number is the binomial coefficient

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
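The index set of this basis is easy to enumerate. A minimal sketch using the standard library; the function name `ascending_tuples` is an illustrative choice.

```python
from itertools import combinations
from math import comb


def ascending_tuples(n, k):
    """All ascending k-tuples from {1, ..., n}, one per k-element subset."""
    return list(combinations(range(1, n + 1), k))


def dimension(n, k):
    """Dimension of the space of alternating k-tensors on an n-dimensional V."""
    return len(ascending_tuples(n, k))
```

For $n = 4$ and $k = 3$ this lists (1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4), and in general the count agrees with `math.comb(n, k)`, which is zero when $k > n$.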
The preceding theorem gives one formula for the elementary alternating
tensor $\psi_I$. There is an alternative formula that expresses $\psi_I$ directly in terms
of the standard basis elements for the larger space $\mathcal{L}^k(V)$. It is given in
Exercise 5.

Finally, we note that alternating tensors behave properly with respect to
a linear transformation of their underlying vector spaces. The proof is left as
an exercise.

Theorem 27.6. Let $T : V \to W$ be a linear transformation.
If $f$ is an alternating tensor on $W$, then $T^*f$ is an alternating tensor
on $V$. □

Determinants 

We now (at long last!) construct the determinant function for matrices 
of size greater than 3 by 3. 

Definition. Let $e_1, \ldots, e_n$ be the usual basis for $\mathbf{R}^n$; let $\phi_1, \ldots, \phi_n$
denote the dual basis for $\mathcal{L}^1(\mathbf{R}^n)$. The space $\mathcal{A}^n(\mathbf{R}^n)$ of alternating $n$-tensors
on $\mathbf{R}^n$ has dimension 1; the unique elementary alternating $n$-tensor on $\mathbf{R}^n$ is
the tensor $\psi_{1,\ldots,n}$. If $X = [x_1 \cdots x_n]$ is an $n$ by $n$ matrix, we define the
determinant of $X$ by the equation

$$\det X = \psi_{1,\ldots,n}(x_1, \ldots, x_n).$$

We show this function satisfies the axioms for the determinant function
given in §2. For convenience, let us for the moment let $g$ denote the function

$$g(X) = \psi_I(x_1, \ldots, x_n),$$

where $I = (1, \ldots, n)$. The function $g$ is multilinear and alternating as a
function of the columns of $X$, because $\psi_I$ is an alternating tensor. Therefore
the function $f$ defined by the equation $f(A) = g(A^{\mathrm{tr}})$ is multilinear and
alternating as a function of the rows of the matrix $A$. Furthermore,

$$f(I_n) = g(I_n) = \psi_I(e_1, \ldots, e_n) = 1.$$

Hence the function $f$ satisfies the axioms for the determinant function. In
particular, it follows from Theorem 2.11 that $f(A) = f(A^{\mathrm{tr}})$. Then $f(A) =
f(A^{\mathrm{tr}}) = g((A^{\mathrm{tr}})^{\mathrm{tr}}) = g(A)$, so that $g$ also satisfies the axioms for the
determinant function, as desired.

The formula for $\psi_I$ given in Theorem 27.5 gives rise to a formula for the
determinant function. If $I = (1, \ldots, n)$, we have

$$\begin{aligned}
\det X &= \sum_\sigma (\operatorname{sgn} \sigma)\,\phi_I(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) \\
&= \sum_\sigma (\operatorname{sgn} \sigma)\, x_{1,\sigma(1)} \cdot x_{2,\sigma(2)} \cdots x_{n,\sigma(n)},
\end{aligned}$$

as you can check. This formula is sometimes used as the definition of the
determinant function.
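The permutation-sum formula can be evaluated directly for small matrices. The following sketch assumes a matrix is stored as a list of rows, so that $x_{i,j}$ is `X[i - 1][j - 1]`; it takes $n!$ steps and is meant only to illustrate the formula, not as a practical algorithm.

```python
from itertools import permutations


def sgn(sigma):
    """Sign of a permutation given as a tuple of distinct integers."""
    k = len(sigma)
    inversions = sum(1 for i in range(k) for j in range(i + 1, k)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


def det(X):
    """Sum over sigma of (sgn sigma) * x_{1,sigma(1)} * ... * x_{n,sigma(n)}."""
    n = len(X)
    total = 0
    for sigma in permutations(range(n)):  # 0-indexed values of sigma
        term = sgn(sigma)
        for i in range(n):
            term *= X[i][sigma[i]]
        total += term
    return total
```

With integer entries the result is exact; for example the matrix with rows (1, 2, 3), (0, 1, 4), (5, 6, 0) has determinant 1.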

We can now obtain a formula for expressing $\psi_I$ directly as a function of
$k$-tuples of vectors of $\mathbf{R}^n$. It is the following:

Theorem 27.7. Let $\psi_I$ be an elementary alternating tensor on $\mathbf{R}^n$
corresponding to the usual basis for $\mathbf{R}^n$, where $I = (i_1, \ldots, i_k)$. Given
vectors $x_1, \ldots, x_k$ of $\mathbf{R}^n$, let $X$ be the matrix $X = [x_1 \cdots x_k]$. Then

$$\psi_I(x_1, \ldots, x_k) = \det X_I,$$

where $X_I$ denotes the matrix whose successive rows are rows $i_1, \ldots, i_k$
of $X$.

Proof. We compute

$$\begin{aligned}
\psi_I(x_1, \ldots, x_k) &= \sum_\sigma (\operatorname{sgn} \sigma)\,\phi_I(x_{\sigma(1)}, \ldots, x_{\sigma(k)}) \\
&= \sum_\sigma (\operatorname{sgn} \sigma)\, x_{i_1,\sigma(1)} \cdot x_{i_2,\sigma(2)} \cdots x_{i_k,\sigma(k)}.
\end{aligned}$$

This is just the formula for $\det X_I$. □
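As a check on Theorem 27.7, one can evaluate $\psi_I$ in two independent ways: from the defining sum over permutations, and as the determinant of the submatrix $X_I$ computed by cofactor expansion. The sketch below assumes vectors are tuples of numbers and $I$ is a tuple of 1-indexed row indices; the function names are illustrative.

```python
from itertools import permutations


def sgn(sigma):
    k = len(sigma)
    inversions = sum(1 for i in range(k) for j in range(i + 1, k)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


def psi_by_sum(I, vectors):
    """Sum over sigma of (sgn sigma) * phi_I(x_sigma(1), ..., x_sigma(k))."""
    k = len(I)
    total = 0
    for sigma in permutations(range(k)):
        term = sgn(sigma)
        for m in range(k):
            # phi_I multiplies coordinate i_m of its m-th argument.
            term *= vectors[sigma[m]][I[m] - 1]
        total += term
    return total


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def psi_by_minor(I, vectors):
    """det X_I, where X = [x_1 ... x_k] and X_I keeps rows i_1, ..., i_k."""
    X_I = [[x[i - 1] for x in vectors] for i in I]
    return det(X_I)
```

For $x = (1, 2, 3)$, $y = (4, 5, 6)$, and $I = (1, 3)$, both functions give $x_1 y_3 - x_3 y_1 = -6$.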

EXAMPLE 2. Consider the space $\mathcal{A}^3(\mathbf{R}^4)$. The elementary alternating
3-tensors on $\mathbf{R}^4$, corresponding to the usual basis for $\mathbf{R}^4$, are the functions

$$\psi_{i,j,k}(x, y, z) = \det \begin{bmatrix} x_i & y_i & z_i \\ x_j & y_j & z_j \\ x_k & y_k & z_k \end{bmatrix},$$

where $(i, j, k)$ equals (1,2,3) or (1,2,4) or (1,3,4) or (2,3,4). The general
element of $\mathcal{A}^3(\mathbf{R}^4)$ is a linear combination of these four functions.

A remark on notation. There is in the subject of multilinear algebra a
standard construction called the exterior product operation. It assigns to any
vector space $W$ a certain quotient of the "$k$-fold tensor product" of $W$; this
quotient is denoted $\Lambda^k(W)$ and is called the "$k$-fold exterior product" of $W$.
(See [Gr], [N].) If $V$ is a finite-dimensional vector space, then the exterior
product operation, when applied to the dual space $V^* = \mathcal{L}^1(V)$, gives a space
$\Lambda^k(V^*)$ that is isomorphic to the space of alternating $k$-tensors on $V$, in a
natural way. For this reason, it is fairly common among mathematicians to
abuse notation and denote the space of alternating $k$-tensors on $V$ by $\Lambda^k(V^*)$.
(See [B-G] and [G-P], for example.)

Unfortunately, others denote the space of alternating $k$-tensors on $V$ by
the symbol $\Lambda^k(V)$ rather than by $\Lambda^k(V^*)$. (See [A-M-R], [B], [D].) Other
notations are also used. (See [F], [S].) Because of this notational confusion,
we have settled on the neutral notation $\mathcal{A}^k(V)$ for use in this book.

EXERCISES

1. Which of the following are alternating tensors in $\mathbf{R}^4$?

$$f(x, y) = x_1 y_2 - x_2 y_1 + x_1 y_1.$$
$$g(x, y) = x_1 y_3 - x_3 y_2.$$
$$h(x, y) = (x_1)^3 (y_2)^3 - (x_2)^3 (y_1)^3.$$

2. Let $\sigma \in S_5$ be the permutation such that

$$(\sigma(1), \sigma(2), \sigma(3), \sigma(4), \sigma(5)) = (3, 1, 4, 5, 2).$$

Use the procedure given in the proof of Lemma 27.1 to write $\sigma$ as a
composite of elementary permutations.

3. Let $\psi_I$ be an elementary $k$-tensor on $V$ corresponding to the basis
$a_1, \ldots, a_n$ for $V$. If $j_1, \ldots, j_k$ is an arbitrary $k$-tuple of integers from
the set $\{1, \ldots, n\}$, what is the value of

$$\psi_I(a_{j_1}, \ldots, a_{j_k})?$$

4. Show that if $T : V \to W$ is a linear transformation and if $f \in \mathcal{A}^k(W)$,
then $T^*f \in \mathcal{A}^k(V)$.

5. Show that

$$\psi_I = \sum_\sigma (\operatorname{sgn} \sigma)\,\phi_{I_\sigma},$$

where if $I = (i_1, \ldots, i_k)$, we let $I_\sigma = (i_{\sigma(1)}, \ldots, i_{\sigma(k)})$. [Hint: Show
first that $(\phi_{I_\sigma})^\sigma = \phi_I$.]

§28. THE WEDGE PRODUCT 


Just as we did for general tensors, we seek to define a product operation in the
set of alternating tensors. The product $f \otimes g$ is almost never alternating, even
if $f$ and $g$ are alternating. So something else is needed. The actual definition
of the product is not very important; what is important are the properties it
satisfies. They are stated in the following theorem:

Theorem 28.1. Let $V$ be a vector space. There is an operation
that assigns, to each $f \in \mathcal{A}^k(V)$ and each $g \in \mathcal{A}^\ell(V)$, an element
$f \wedge g \in \mathcal{A}^{k+\ell}(V)$, such that the following properties hold:

(1) (Associativity). $f \wedge (g \wedge h) = (f \wedge g) \wedge h$.

(2) (Homogeneity). $(cf) \wedge g = c(f \wedge g) = f \wedge (cg)$.

(3) (Distributivity). If $f$ and $g$ have the same order,

$$(f + g) \wedge h = f \wedge h + g \wedge h,$$
$$h \wedge (f + g) = h \wedge f + h \wedge g.$$

(4) (Anticommutativity). If $f$ and $g$ have orders $k$ and $\ell$, respectively,
then

$$g \wedge f = (-1)^{k\ell} f \wedge g.$$

(5) Given a basis $a_1, \ldots, a_n$ for $V$, let $\phi_i$ denote the dual basis for
$V^*$, and let $\psi_I$ denote the corresponding elementary alternating
tensors. If $I = (i_1, \ldots, i_k)$ is an ascending $k$-tuple of integers
from the set $\{1, \ldots, n\}$, then

$$\psi_I = \phi_{i_1} \wedge \phi_{i_2} \wedge \cdots \wedge \phi_{i_k}.$$

These five properties characterize the product $\wedge$ uniquely for
finite-dimensional spaces $V$. Furthermore, it has the following additional
property:

(6) If $T : V \to W$ is a linear transformation, and if $f$ and $g$ are
alternating tensors on $W$, then

$$T^*(f \wedge g) = T^*f \wedge T^*g.$$


The tensor $f \wedge g$ is called the wedge product of $f$ and $g$. Note that
property (4) implies that for an alternating tensor $f$ of odd order, $f \wedge f = 0$.

Proof. Step 1. Let $F$ be a $k$-tensor on $V$ (not necessarily alternating).
For purposes of this proof, it is convenient to define a transformation
$A : \mathcal{L}^k(V) \to \mathcal{L}^k(V)$ by the formula

$$AF = \sum_\sigma (\operatorname{sgn} \sigma) F^\sigma,$$

where the summation extends over all $\sigma \in S_k$. (Sometimes a factor of $1/k!$
is included in this formula, but that is not necessary for our purposes.) Note
that in this notation, the definition of the elementary alternating tensors can
be written as

$$\psi_I = A\phi_I.$$

The transformation $A$ has the following properties:

(i) $A$ is linear.

(ii) $AF$ is an alternating tensor.

(iii) If $F$ is already alternating, then $AF = (k!)F$.

Let us check these properties. The fact that $A$ is linear comes from the
fact that the map $F \to F^\sigma$ is linear. The fact that $AF$ is alternating comes
from the computation

$$\begin{aligned}
(AF)^\tau &= \sum_\sigma (\operatorname{sgn} \sigma)(F^\sigma)^\tau && \text{by linearity,} \\
&= \sum_\sigma (\operatorname{sgn} \sigma) F^{\tau \circ \sigma} \\
&= (\operatorname{sgn} \tau) \sum_\sigma (\operatorname{sgn} \tau \circ \sigma) F^{\tau \circ \sigma} \\
&= (\operatorname{sgn} \tau) AF.
\end{aligned}$$

(This is the same computation we made earlier in showing that $\psi_I$ is
alternating.) Finally, if $F$ is already alternating, then $F^\sigma = (\operatorname{sgn} \sigma)F$ for all $\sigma$.
It follows that

$$AF = \sum_\sigma (\operatorname{sgn} \sigma)^2 F = (k!)F.$$
Step 2. We now define the product $f \wedge g$. If $f$ is an alternating $k$-tensor
on $V$, and $g$ is an alternating $\ell$-tensor on $V$, we define

$$f \wedge g = \frac{1}{k!\,\ell!} A(f \otimes g).$$

Then $f \wedge g$ is an alternating tensor of order $k + \ell$.

It is not entirely clear why the coefficient $1/k!\ell!$ appears in this formula.
Some such coefficient is in fact necessary if the wedge product is to be
associative. One way of motivating the particular choice of the coefficient $1/k!\ell!$
is the following: Let us rewrite the definition of $f \wedge g$ in the form

$$(f \wedge g)(v_1, \ldots, v_{k+\ell}) =
\frac{1}{k!\,\ell!} \sum_\sigma (\operatorname{sgn} \sigma)\, f(v_{\sigma(1)}, \ldots, v_{\sigma(k)}) \cdot g(v_{\sigma(k+1)}, \ldots, v_{\sigma(k+\ell)}).$$

Then let us consider a single term of the summation, say

$$(\operatorname{sgn} \sigma)\, f(v_{\sigma(1)}, \ldots, v_{\sigma(k)}) \cdot g(v_{\sigma(k+1)}, \ldots, v_{\sigma(k+\ell)}).$$

A number of other terms of the summation can be obtained from this one
by permuting the vectors $v_{\sigma(1)}, \ldots, v_{\sigma(k)}$ among themselves, and permuting
the vectors $v_{\sigma(k+1)}, \ldots, v_{\sigma(k+\ell)}$ among themselves. Of course, the factor
$(\operatorname{sgn} \sigma)$ changes as we carry out these permutations, but because $f$ and $g$ are
alternating, the values of $f$ and $g$ change by being multiplied by the same
sign. Hence all these terms have precisely the same value. There are $k!\ell!$ such
terms, so it is reasonable to divide the sum by this number to eliminate the
effect of this redundancy.
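The operator $A$ and the definition of the wedge product can be tried out numerically. In the sketch below a tensor is assumed to be a Python function of its vector arguments, whose order is passed alongside it; exact rational arithmetic (`Fraction`) is used so that the factor $1/k!\ell!$ introduces no rounding. The names `alternate`, `tensor`, `wedge`, and `phi` are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sgn(sigma):
    """Sign of a permutation given as a tuple of distinct integers."""
    k = len(sigma)
    inversions = sum(1 for i in range(k) for j in range(i + 1, k)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1


def alternate(F, k):
    """Return AF, the sum over sigma in S_k of (sgn sigma) F^sigma."""
    def AF(*v):
        return sum(sgn(s) * F(*(v[m] for m in s))
                   for s in permutations(range(k)))
    return AF


def tensor(f, k, g, l):
    """Return f (x) g for f of order k and g of order l."""
    def fg(*v):
        return f(*v[:k]) * g(*v[k:k + l])
    return fg


def wedge(f, k, g, l):
    """Return f ^ g = A(f (x) g) / (k! l!)."""
    A = alternate(tensor(f, k, g, l), k + l)
    scale = Fraction(1, factorial(k) * factorial(l))
    def f_wedge_g(*v):
        return scale * A(*v)
    return f_wedge_g


def phi(i):
    """The dual basis 1-tensor phi_i on R^n: picks out coordinate i."""
    return lambda x: x[i - 1]
```

With $x = (1, 2, 3)$, $y = (0, 1, 4)$, $z = (5, 6, 0)$, the value $(\phi_1 \wedge \phi_2)(x, y)$ is $x_1 y_2 - x_2 y_1 = 1$, applying `alternate` to $\phi_1 \wedge \phi_2$ doubles it (property (iii) with $k = 2$), and both groupings of $\phi_1 \wedge \phi_2 \wedge \phi_3$ evaluate to the determinant of $[x\ y\ z]$, which is 1.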

Step 3. Associativity is the most difficult of the properties to verify, so
we postpone it for the moment. To check homogeneity, we compute

$$\begin{aligned}
(cf) \wedge g &= A((cf) \otimes g)/k!\ell! \\
&= A(c(f \otimes g))/k!\ell! && \text{by homogeneity of } \otimes, \\
&= cA(f \otimes g)/k!\ell! && \text{by linearity of } A, \\
&= c(f \wedge g).
\end{aligned}$$

A similar computation verifies the other part of homogeneity. Distributivity
follows similarly from distributivity of $\otimes$ and linearity of $A$.

Step 4. We verify anticommutativity. In fact, we prove something
slightly more general: Let $F$ and $G$ be tensors of orders $k$ and $\ell$,
respectively (not necessarily alternating). We show that

$$A(F \otimes G) = (-1)^{k\ell} A(G \otimes F).$$

To begin, let $\pi$ be the permutation of $(1, \ldots, k + \ell)$ such that

$$(\pi(1), \ldots, \pi(k + \ell)) = (k+1, k+2, \ldots, k+\ell, 1, 2, \ldots, k).$$

Then $\operatorname{sgn} \pi = (-1)^{k\ell}$. (Count inversions!) It is easy to see that
$(G \otimes F)^\pi = F \otimes G$, since

$$(G \otimes F)^\pi(v_1, \ldots, v_{k+\ell}) = G(v_{k+1}, \ldots, v_{k+\ell}) \cdot F(v_1, \ldots, v_k),$$
$$(F \otimes G)(v_1, \ldots, v_{k+\ell}) = F(v_1, \ldots, v_k) \cdot G(v_{k+1}, \ldots, v_{k+\ell}).$$

We then compute

$$\begin{aligned}
A(F \otimes G) &= \sum_\sigma (\operatorname{sgn} \sigma)(F \otimes G)^\sigma \\
&= \sum_\sigma (\operatorname{sgn} \sigma)((G \otimes F)^\pi)^\sigma \\
&= (\operatorname{sgn} \pi) \sum_\sigma (\operatorname{sgn} \sigma \circ \pi)(G \otimes F)^{\sigma \circ \pi} \\
&= (\operatorname{sgn} \pi) A(G \otimes F),
\end{aligned}$$

since $\sigma \circ \pi$ runs over all elements of $S_{k+\ell}$ as $\sigma$ does.

Step 5. Now we verify associativity. The proof requires several steps,
of which the first is this:

Let $F$ and $G$ be tensors (not necessarily alternating) of orders $k$ and $\ell$,
respectively, such that $AF = 0$. Then $A(F \otimes G) = 0$.

To prove that this result holds, let us consider one term of the expression
for $A(F \otimes G)$, say the term

$$(\operatorname{sgn} \sigma)\, F(v_{\sigma(1)}, \ldots, v_{\sigma(k)}) \cdot G(v_{\sigma(k+1)}, \ldots, v_{\sigma(k+\ell)}).$$

Let us group together all the terms in the expression for $A(F \otimes G)$ that involve
the same last factor as this one. These terms can be written in the form

$$(\operatorname{sgn} \sigma) \Big[ \sum_\tau (\operatorname{sgn} \tau)\, F(v_{\sigma(\tau(1))}, \ldots, v_{\sigma(\tau(k))}) \Big] \cdot G(v_{\sigma(k+1)}, \ldots, v_{\sigma(k+\ell)}),$$

where $\tau$ ranges over all permutations of $\{1, \ldots, k\}$. Now the expression in
brackets is just

$$AF(v_{\sigma(1)}, \ldots, v_{\sigma(k)}),$$

which vanishes by hypothesis. Thus the terms in this group cancel one
another.

The same argument applies to each group of terms that involve the same
last factor. We conclude that $A(F \otimes G) = 0$.

Step 6. Let $F$ be an arbitrary tensor and let $h$ be an alternating tensor
of order $m$. We show that

$$(AF) \wedge h = \frac{1}{m!} A(F \otimes h).$$

Let $F$ have order $k$. Our desired equation can be written as

$$\frac{1}{k!\,m!} A((AF) \otimes h) = \frac{1}{m!} A(F \otimes h).$$

Linearity of $A$ and distributivity of $\otimes$ show this equation is equivalent to each
of the equations

$$A\{(AF) \otimes h - (k!)F \otimes h\} = 0,$$
$$A\{[AF - (k!)F] \otimes h\} = 0.$$

In view of Step 5, this equation holds if we can show that

$$A[AF - (k!)F] = 0.$$

But this follows immediately from property (iii) of the transformation $A$, since
$AF$ is an alternating tensor of order $k$.

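Step 7. Let $f, g, h$ be alternating tensors of orders $k, \ell, m$ respectively.
We show that

$$(f \wedge g) \wedge h = \frac{1}{k!\,\ell!\,m!} A((f \otimes g) \otimes h).$$

Let $F = f \otimes g$, for convenience. We have

$$f \wedge g = \frac{1}{k!\,\ell!} AF$$

by definition, so that

$$\begin{aligned}
(f \wedge g) \wedge h &= \frac{1}{k!\,\ell!} (AF) \wedge h \\
&= \frac{1}{k!\,\ell!\,m!} A(F \otimes h) && \text{by Step 6,} \\
&= \frac{1}{k!\,\ell!\,m!} A((f \otimes g) \otimes h).
\end{aligned}$$

Step 8. Finally, we verify associativity. Let $f, g, h$ be as in Step 7.
Then

$$\begin{aligned}
(k!\,\ell!\,m!)(f \wedge g) \wedge h &= A((f \otimes g) \otimes h) && \text{by Step 7,} \\
&= A(f \otimes (g \otimes h)) && \text{by associativity of } \otimes, \\
&= (-1)^{k(\ell+m)} A((g \otimes h) \otimes f) && \text{by Step 4,} \\
&= (-1)^{k(\ell+m)} (\ell!\,m!\,k!)(g \wedge h) \wedge f && \text{by Step 7,} \\
&= (k!\,\ell!\,m!)\, f \wedge (g \wedge h) && \text{by anticommutativity.}
\end{aligned}$$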

Step 9. We verify property (5). In fact, we prove something slightly
more general. We show that for any collection $f_1, \ldots, f_k$ of 1-tensors,

$$(*) \qquad A(f_1 \otimes \cdots \otimes f_k) = f_1 \wedge \cdots \wedge f_k.$$

Property (5) is an immediate consequence, since

$$\psi_I = A\phi_I = A(\phi_{i_1} \otimes \cdots \otimes \phi_{i_k}).$$

Formula (*) is trivial for $k = 1$. Supposing it true for $k - 1$, we prove it
for $k$. Set $F = f_1 \otimes \cdots \otimes f_{k-1}$. Then

$$\begin{aligned}
A(F \otimes f_k) &= (1!)(AF) \wedge f_k && \text{by Step 6,} \\
&= (f_1 \wedge \cdots \wedge f_{k-1}) \wedge f_k,
\end{aligned}$$

by the induction hypothesis.

Step 10. We verify uniqueness; indeed, we show how one can calculate
wedge products, in the case of a finite-dimensional space $V$, using only
properties (1)-(5). Let $\phi_i$ and $\psi_I$ be as in property (5). Given alternating tensors
$f$ and $g$, we can write them uniquely in terms of the elementary alternating
tensors as

$$f = \sum_{[I]} b_I \psi_I \quad \text{and} \quad g = \sum_{[J]} c_J \psi_J.$$

(Here $I$ is an ascending $k$-tuple, and $J$ is an ascending $\ell$-tuple, from the set
$\{1, \ldots, n\}$.) Distributivity and homogeneity imply that

$$f \wedge g = \sum_{[I]} \sum_{[J]} b_I c_J\, \psi_I \wedge \psi_J.$$

Therefore, to compute $f \wedge g$ we need only know how to compute wedge
products of the form

$$\psi_I \wedge \psi_J = (\phi_{i_1} \wedge \cdots \wedge \phi_{i_k}) \wedge (\phi_{j_1} \wedge \cdots \wedge \phi_{j_\ell}).$$

For that, we use associativity and the simple rules

$$\phi_i \wedge \phi_j = -\phi_j \wedge \phi_i \quad \text{and} \quad \phi_i \wedge \phi_i = 0,$$

which follow from anticommutativity. It follows that the product $\psi_I \wedge \psi_J$
equals zero if two indices are the same. Otherwise it equals $(\operatorname{sgn} \pi)$ times the
elementary alternating $(k + \ell)$-tensor $\psi_K$ whose index is obtained by rearranging
the indices in the $(k + \ell)$-tuple $(I, J)$ in ascending order, where $\pi$ is the
permutation required to carry out this rearrangement.
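The rule at the end of Step 10 amounts to a small algorithm on index tuples. A minimal sketch, assuming $I$ and $J$ are given as ascending tuples of integers; the function name `wedge_elementary` and the convention of returning the pair (sign, index) are illustrative choices. The sign is computed as the parity of the number of inversions in the concatenated tuple, which equals the parity of the number of adjacent swaps needed to sort it.

```python
def wedge_elementary(I, J):
    """Return (sign, K) with psi_I ^ psi_J = sign * psi_K.

    The sign is 0 (and K is empty) when I and J share an index.
    """
    indices = list(I) + list(J)
    if len(set(indices)) < len(indices):
        return 0, ()
    inversions = sum(1
                     for a in range(len(indices))
                     for b in range(a + 1, len(indices))
                     if indices[a] > indices[b])
    sign = -1 if inversions % 2 else 1
    return sign, tuple(sorted(indices))
```

For example, $\psi_{(1,3)} \wedge \psi_{(2)} = -\psi_{(1,2,3)}$, while $\psi_{(3,4)} \wedge \psi_{(1,2)} = \psi_{(1,2,3,4)}$, in agreement with the sign $(-1)^{k\ell}$ of property (4).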

Step 11. We complete the proof by verifying property (6). Let
$T : V \to W$ be a linear transformation, and $F$ be an arbitrary tensor on $W$ (not
necessarily alternating). It is easy to verify that $T^*(F^\sigma) = (T^*F)^\sigma$. Since
$T^*$ is linear, it then follows that $T^*(AF) = A(T^*F)$.

Now let $f$ and $g$ be alternating tensors on $W$ of orders $k$ and $\ell$,
respectively. We compute

$$\begin{aligned}
T^*(f \wedge g) &= \frac{1}{k!\,\ell!} T^*(A(f \otimes g)) \\
&= \frac{1}{k!\,\ell!} A(T^*(f \otimes g)) \\
&= \frac{1}{k!\,\ell!} A((T^*f) \otimes (T^*g)) && \text{by Theorem 26.5,} \\
&= (T^*f) \wedge (T^*g). \qquad \square
\end{aligned}$$

With this theorem, we complete our study of multilinear algebra. There
is, of course, much more to the subject (see [N] or [Gr], for example), but this
is all we shall need. We shall in fact need only alternating tensors and their
properties, as discussed in this section and the preceding one.

We remark that in some texts, such as [G-P], a slightly different definition
of the wedge product is used; the coefficient $1/(k+\ell)!$ appears in the definition
in place of the coefficient $1/k!\ell!$. This choice of coefficient also leads to an
operation that is associative, as you can check. In fact, all the properties
listed in Theorem 28.1 remain unchanged except for (5), which is altered by
the insertion of a factor of $k!$ on the right side of the equation for $\psi_I$.

EXERCISES

1. Let $x, y, z \in \mathbf{R}^5$. Let

$$F(x, y, z) = 2 x_2 y_2 z_1 + x_1 y_5 z_4,$$
$$G(x, y) = x_1 y_3 + x_3 y_1,$$
$$h(w) = w_1 - 2 w_3.$$

(a) Write $AF$ and $AG$ in terms of elementary alternating tensors. [Hint:
Write $F$ and $G$ in terms of elementary tensors and use Step 9 of the
preceding proof to compute $A\phi_I$.]

(b) Express $(AF) \wedge h$ in terms of elementary alternating tensors.

(c) Express $(AF)(x, y, z)$ as a function.

2. If $G$ is symmetric, show that $AG = 0$. Does the converse hold?

3. Show that if $f_1, \ldots, f_k$ are alternating tensors of orders $\ell_1, \ldots, \ell_k$,
respectively, then

$$\frac{1}{\ell_1! \cdots \ell_k!} A(f_1 \otimes \cdots \otimes f_k) = f_1 \wedge \cdots \wedge f_k.$$

4. Let $x_1, \ldots, x_k$ be vectors in $\mathbf{R}^n$; let $X$ be the matrix $X = [x_1 \cdots x_k]$.
If $I = (i_1, \ldots, i_k)$ is an arbitrary $k$-tuple from the set $\{1, \ldots, n\}$, show
that

$$\phi_{i_1} \wedge \cdots \wedge \phi_{i_k}(x_1, \ldots, x_k) = \det X_I.$$

5. Verify that $T^*(F^\sigma) = (T^*F)^\sigma$.

6. Let $T : \mathbf{R}^m \to \mathbf{R}^n$ be the linear transformation $T(x) = B \cdot x$.

(a) If $\psi_I$ is an elementary alternating $k$-tensor on $\mathbf{R}^n$, then $T^*\psi_I$ has the
form

$$T^*\psi_I = \sum_{[J]} c_J \psi_J,$$

where the $\psi_J$ are the elementary alternating $k$-tensors on $\mathbf{R}^m$. What
are the coefficients $c_J$?

(b) If $f = \sum_{[I]} d_I \psi_I$ is an alternating $k$-tensor on $\mathbf{R}^n$, express $T^*f$ in
terms of the elementary alternating $k$-tensors on $\mathbf{R}^m$.


§29. TANGENT VECTORS AND DIFFERENTIAL FORMS 

In calculus, one studies vector algebra in $\mathbf{R}^3$: vector addition, dot products,
cross products, and the like. Scalar and vector fields are introduced; and
certain operators on scalar and vector fields are defined, namely, the operators

$$\operatorname{grad} f = \nabla f, \quad \operatorname{curl} F = \nabla \times F, \quad \text{and} \quad \operatorname{div} G = \nabla \cdot G.$$

These operators are crucial in the formulation of the basic theorems of the
vector integral calculus.

Analogously, we have in this chapter studied tensor algebra in $\mathbf{R}^n$: tensor
addition, alternating tensors, wedge products, and the like. Now we introduce
the concept of a tensor field; more specifically, that of an alternating tensor
field, which is called a "differential form." In the succeeding section, we shall
introduce a certain operator on differential forms, called the "differential
operator" $d$, which is the analogue of the operators grad, curl, and div. This
operator is crucial in the formulation of the basic theorems concerning
integrals of differential forms, which we shall study in the next chapter.

We begin by discussing vector fields in a somewhat more sophisticated
manner than is done in calculus.

Tangent vectors and vector fields

Definition. Given $x \in \mathbf{R}^n$, we define a tangent vector to $\mathbf{R}^n$ at $x$ to
be a pair $(x; v)$, where $v \in \mathbf{R}^n$. The set of all tangent vectors to $\mathbf{R}^n$ at $x$
forms a vector space if we define

$$(x; v) + (x; w) = (x; v + w),$$
$$c(x; v) = (x; cv).$$

It is called the tangent space to $\mathbf{R}^n$ at $x$, and is denoted $T_x(\mathbf{R}^n)$.

Although both $x$ and $v$ are elements of $\mathbf{R}^n$ in this definition, they play
different roles. We think of $x$ as a point of the metric space $\mathbf{R}^n$ and picture it
as a "dot." We think of $v$ as an element of the vector space $\mathbf{R}^n$ and picture it
as an "arrow." We picture $(x; v)$ as an arrow with its initial point at $x$. The
set $T_x(\mathbf{R}^n)$ is pictured as the set of all arrows with their initial points at $x$; it
is, of course, just the set $x \times \mathbf{R}^n$.

We do not attempt to form the sum $(x; v) + (y; w)$ if $x \ne y$.

Definition. Let $(a, b)$ be an open interval in $\mathbf{R}$; let $\gamma : (a, b) \to \mathbf{R}^n$ be
a map of class $C^r$. We define the velocity vector of $\gamma$, corresponding to the
parameter value $t$, to be the vector $(\gamma(t); D\gamma(t))$.

This vector is pictured as an arrow in $\mathbf{R}^n$ with its initial point at the
point $p = \gamma(t)$. See Figure 29.1. This notion of a velocity vector is of course
a reformulation of a familiar notion from calculus. If

$$\gamma(t) = x(t)e_1 + y(t)e_2 + z(t)e_3$$

is a parametrized-curve in $\mathbf{R}^3$, then the velocity vector of $\gamma$ is defined in
calculus as the vector

$$D\gamma(t) = \frac{dx}{dt} e_1 + \frac{dy}{dt} e_2 + \frac{dz}{dt} e_3.$$

Figure 29.1

More generally, we make the following definition:

Definition. Let $A$ be open in $\mathbf{R}^k$ or $\mathbf{H}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^r$.
Let $x \in A$, and let $p = \alpha(x)$. We define a linear transformation

$$\alpha_* : T_x(\mathbf{R}^k) \to T_p(\mathbf{R}^n)$$

by the equation

$$\alpha_*(x; v) = (p; D\alpha(x) \cdot v).$$

It is said to be the transformation induced by the differentiable map $\alpha$.

Given $(x; v)$, the chain rule implies that the vector $\alpha_*(x; v)$ is in fact the
velocity vector of the curve $\gamma(t) = \alpha(x + tv)$, corresponding to the parameter
value $t = 0$. See Figure 29.2.
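The relation between $\alpha_*$ and velocity vectors can be illustrated numerically. The sketch below approximates $D\alpha(x)$ by central differences, which is an assumption of this example rather than anything in the text; the sample map `alpha` from $\mathbf{R}^2$ to $\mathbf{R}^3$ and all function names are hypothetical.

```python
import math


def jacobian(alpha, x, h=1e-6):
    """Central-difference approximation to D alpha(x), as a list of rows."""
    n = len(alpha(x))
    columns = []
    for j in range(len(x)):
        forward, backward = list(x), list(x)
        forward[j] += h
        backward[j] -= h
        fp, fm = alpha(forward), alpha(backward)
        columns.append([(fp[i] - fm[i]) / (2 * h) for i in range(n)])
    return [[columns[j][i] for j in range(len(x))] for i in range(n)]


def pushforward(alpha, x, v):
    """Approximate alpha_*(x; v) = (alpha(x); D alpha(x) . v)."""
    D = jacobian(alpha, x)
    p = tuple(alpha(x))
    w = tuple(sum(D[i][j] * v[j] for j in range(len(v))) for i in range(len(D)))
    return p, w


def velocity_at_zero(alpha, x, v, h=1e-6):
    """Approximate D gamma(0) for the curve gamma(t) = alpha(x + t v)."""
    ahead = alpha([xi + h * vi for xi, vi in zip(x, v)])
    behind = alpha([xi - h * vi for xi, vi in zip(x, v)])
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))


def alpha(x):
    """Hypothetical sample map R^2 -> R^3 used only for illustration."""
    u, t = x
    return (u * math.cos(t), u * math.sin(t), u * u)
```

At $x = (1, 0)$ with $v = (2, 3)$, the exact derivative gives $\alpha_*(x; v) = ((1, 0, 1); (2, 3, 4))$, and both numerical routines agree with this up to discretization error.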



Figure 29.2 


For later use, we note the following formal property of the
transformation $\alpha_*$:


Lemma 29.1. Let $A$ be open in $\mathbf{R}^k$ or $\mathbf{H}^k$; let $\alpha : A \to \mathbf{R}^m$ be of class
$C^r$. Let $B$ be an open set of $\mathbf{R}^m$ or $\mathbf{H}^m$ containing $\alpha(A)$; let $\beta : B \to \mathbf{R}^n$
be of class $C^r$. Then

$$(\beta \circ \alpha)_* = \beta_* \circ \alpha_*.$$

Proof. This formula is just the chain rule. Let $y = \alpha(x)$ and let $z =
\beta(y)$. We compute

$$\begin{aligned}
(\beta \circ \alpha)_*(x; v) &= (\beta(\alpha(x)); D(\beta \circ \alpha)(x) \cdot v) \\
&= (\beta(y); D\beta(y) \cdot D\alpha(x) \cdot v) \\
&= \beta_*(y; D\alpha(x) \cdot v) \\
&= \beta_*(\alpha_*(x; v)). \qquad \square
\end{aligned}$$
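The lemma can be checked on a concrete pair of maps. In the sketch below each map is supplied together with its exact derivative matrix, so the arithmetic is exact on integer inputs; the maps `alpha` and `beta`, their derivatives, and the helper `pushforward` are hypothetical examples chosen for illustration.

```python
def pushforward(f, Df):
    """Return f_* as a function on tangent vectors (x; v), stored as (x, v)."""
    def f_star(tangent):
        x, v = tangent
        D = Df(x)
        w = tuple(sum(D[i][j] * v[j] for j in range(len(v)))
                  for i in range(len(D)))
        return (tuple(f(x)), w)
    return f_star


# alpha : R^2 -> R^2 and its derivative matrix.
def alpha(x):
    u, t = x
    return (u * t, u + t)


def D_alpha(x):
    u, t = x
    return [[t, u], [1, 1]]


# beta : R^2 -> R^1 and its derivative matrix.
def beta(y):
    a, b = y
    return (a * a + b,)


def D_beta(y):
    a, b = y
    return [[2 * a, 1]]


# The composite and its derivative, computed by hand:
# (beta o alpha)(u, t) = u^2 t^2 + u + t.
def beta_alpha(x):
    return beta(alpha(x))


def D_beta_alpha(x):
    u, t = x
    return [[2 * u * t * t + 1, 2 * u * u * t + 1]]
```

At the tangent vector $((2, 3); (1, -1))$, applying $\alpha_*$ gives $((6, 5); (1, 0))$, and then $\beta_*$ gives $((41); (12))$, the same as applying $(\beta \circ \alpha)_*$ directly.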

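These maps and their induced transformations are indicated in the
following diagrams:

$$A \xrightarrow{\ \alpha\ } B \xrightarrow{\ \beta\ } \mathbf{R}^n, \qquad
T_x(\mathbf{R}^k) \xrightarrow{\ \alpha_*\ } T_y(\mathbf{R}^m) \xrightarrow{\ \beta_*\ } T_z(\mathbf{R}^n).$$

Definition. If $A$ is an open set in $\mathbf{R}^n$, a tangent vector field in $A$ is
a continuous function $F : A \to \mathbf{R}^n \times \mathbf{R}^n$ such that $F(x) \in T_x(\mathbf{R}^n)$, for each
$x \in A$. Then $F$ has the form $F(x) = (x; f(x))$, where $f : A \to \mathbf{R}^n$. If $F$ is of
class $C^r$, we say that it is a tangent vector field of class $C^r$.

Now we define tangent vectors to manifolds. We shall use these notions
in Chapter 7.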

Definition. Let $M$ be a $k$-manifold of class $C^r$ in $\mathbf{R}^n$. If $p \in M$, choose
a coordinate patch $\alpha : U \to V$ about $p$, where $U$ is open in $\mathbf{R}^k$ or $\mathbf{H}^k$. Let $x$
be the point of $U$ such that $\alpha(x) = p$. The set of all vectors of the form
$\alpha_*(x; v)$, where $v$ is a vector in $\mathbf{R}^k$, is called the tangent space to $M$ at $p$,
and is denoted $T_p(M)$. Said differently,

$$T_p(M) = \alpha_*(T_x(\mathbf{R}^k)).$$


It is not hard to show that $T_p(M)$ is a linear subspace of $T_p(\mathbf{R}^n)$ that is
well-defined, independent of the choice of $\alpha$. Because $\mathbf{R}^k$ is spanned by the
vectors $e_1, \ldots, e_k$, the space $T_p(M)$ is spanned by the vectors

$$(p; D\alpha(x) \cdot e_j) = (p; \partial\alpha/\partial x_j),$$

for $j = 1, \ldots, k$. Since $D\alpha$ has rank $k$, these vectors are independent; hence
they form a basis for $T_p(M)$. Typical cases are pictured in Figure 29.3.


We denote the union of the tangent spaces $T_p(M)$, for $p \in M$, by $T(M)$;
and we call it the tangent bundle of $M$. A tangent vector field to $M$
is a continuous function $F : M \to T(M)$ such that $F(p) \in T_p(M)$ for each
$p \in M$.

Tensor fields 

Definition. Let $A$ be an open set in $\mathbf{R}^n$. A $k$-tensor field in $A$ is a
function $\omega$ assigning, to each $\mathbf{x} \in A$, a $k$-tensor defined on the vector space
$T_{\mathbf{x}}(\mathbf{R}^n)$. That is,

\[ \omega(\mathbf{x}) \in \mathcal{L}^k(T_{\mathbf{x}}(\mathbf{R}^n)) \]

for each $\mathbf{x}$. Thus $\omega(\mathbf{x})$ is a function mapping $k$-tuples of tangent vectors to
$\mathbf{R}^n$ at $\mathbf{x}$ into $\mathbf{R}$; as such, its value on a given $k$-tuple can be written in the
form

\[ \omega(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_k)). \]

We require this function to be continuous as a function of $(\mathbf{x}, \mathbf{v}_1, \ldots, \mathbf{v}_k)$; if
it is of class $C^r$, we say that $\omega$ is a tensor field of class $C^r$. If it happens that
$\omega(\mathbf{x})$ is an alternating $k$-tensor for each $\mathbf{x}$, then $\omega$ is called a differential
form (or simply, a form) of order $k$, on $A$.

More generally, if $M$ is an $m$-manifold in $\mathbf{R}^n$, then we define a $k$-tensor
field on $M$ to be a function $\omega$ assigning to each $\mathbf{p} \in M$ an element of
$\mathcal{L}^k(T_{\mathbf{p}}(M))$. If in fact $\omega(\mathbf{p})$ is alternating for each $\mathbf{p}$, then $\omega$ is called a
differential form on $M$.

If $\omega$ is a tensor field defined on an open set of $\mathbf{R}^n$ containing $M$, then $\omega$
of course restricts to a tensor field defined on $M$, since every tangent vector
to $M$ is also a tangent vector to $\mathbf{R}^n$. Conversely, any tensor field on $M$ can
be extended to a tensor field defined on an open set of $\mathbf{R}^n$ containing $M$;
the proof, however, is decidedly non-trivial. For simplicity, we shall restrict
ourselves in this book to tensor fields that are defined on open sets of $\mathbf{R}^n$.

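Definition. Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be the usual basis for $\mathbf{R}^n$. Then $(\mathbf{x}; \mathbf{e}_1), \ldots, (\mathbf{x}; \mathbf{e}_n)$
is called the usual basis for $T_{\mathbf{x}}(\mathbf{R}^n)$. We define a 1-form $\phi_i$ on $\mathbf{R}^n$ by
the equation

\[ \phi_i(\mathbf{x})(\mathbf{x}; \mathbf{e}_j) = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j. \end{cases} \]

The forms $\phi_1, \ldots, \phi_n$ are called the elementary 1-forms on $\mathbf{R}^n$. Similarly,
given an ascending $k$-tuple $I = (i_1, \ldots, i_k)$ from the set $\{1, \ldots, n\}$, we define
a $k$-form $\psi_I$ on $\mathbf{R}^n$ by the equation

\[ \psi_I(\mathbf{x}) = \phi_{i_1}(\mathbf{x}) \wedge \cdots \wedge \phi_{i_k}(\mathbf{x}). \]

The forms $\psi_I$ are called the elementary $k$-forms on $\mathbf{R}^n$.

Note that for each $\mathbf{x}$, the 1-tensors $\phi_1(\mathbf{x}), \ldots, \phi_n(\mathbf{x})$ constitute the basis
for $\mathcal{L}^1(T_{\mathbf{x}}(\mathbf{R}^n))$ dual to the usual basis for $T_{\mathbf{x}}(\mathbf{R}^n)$, and the $k$-tensor $\psi_I(\mathbf{x})$ is
the corresponding elementary alternating tensor on $T_{\mathbf{x}}(\mathbf{R}^n)$.

The fact that $\phi_i$ and $\psi_I$ are of class $C^\infty$ follows at once from the equations

\[ \phi_i(\mathbf{x})(\mathbf{x}; \mathbf{v}) = v_i, \]

\[ \psi_I(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_k)) = \det X_I, \]

where $X$ is the matrix $X = [\mathbf{v}_1 \cdots \mathbf{v}_k]$.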
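The determinant formula makes the elementary forms easy to evaluate by hand or by machine. The sketch below is an illustration under stated assumptions, not the book's construction: the helper names `det` and `psi` are hypothetical, indices in `I` are 1-based to match the text, and $X_I$ is taken to be the matrix formed by rows $i_1, \ldots, i_k$ of $X$. Since the elementary forms do not depend on the base point $\mathbf{x}$, only the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are passed in.

```python
from itertools import permutations

def det(M):
    """Determinant via the permutation expansion (adequate for small matrices)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation, computed from its inversion count.
        inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                         if perm[a] > perm[b])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total

def psi(I, vectors):
    """Value of the elementary k-form psi_I on (x; v_1), ..., (x; v_k).

    I is an ascending tuple of 1-based indices. The result is det X_I, where
    X = [v_1 ... v_k] and X_I consists of rows i_1, ..., i_k of X.
    """
    X_I = [[v[i - 1] for v in vectors] for i in I]
    return det(X_I)

v1, v2 = (1, 2, 3, 4), (0, 1, 5, -2)
value = psi((1, 3), [v1, v2])      # det [[1, 0], [3, 5]]
swapped = psi((1, 3), [v2, v1])    # exchanging the two arguments flips the sign
repeated = psi((1, 3), [v1, v1])   # a repeated argument gives zero
second_component = psi((2,), [v1]) # the 1-form phi_2 picks out the component v_2
print(value, swapped, repeated, second_component)
```

The sign change and the vanishing on a repeated argument are the alternating property of $\psi_I(\mathbf{x})$ seen through the determinant.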

If $\omega$ is a $k$-form defined on an open set $A$ of $\mathbf{R}^n$, then the $k$-tensor $\omega(\mathbf{x})$
can be written uniquely in the form

\[ \omega(\mathbf{x}) = \sum_{[I]} b_I(\mathbf{x})\, \psi_I(\mathbf{x}), \]

for some scalar functions $b_I(\mathbf{x})$. These functions are called the components
of $\omega$ relative to the standard elementary forms in $\mathbf{R}^n$.

Lemma 29.2. Let $\omega$ be a $k$-form on the open set $A$ of $\mathbf{R}^n$. Then
$\omega$ is of class $C^r$ if and only if its component functions $b_I$ are of class
$C^r$ on $A$.

Proof. Given $\omega$, let us express it in terms of elementary forms by the
equation

\[ \omega = \sum_{[I]} b_I\, \psi_I. \]

The functions $\psi_I$ are of class $C^\infty$. Therefore, if the functions $b_I$ are of
class $C^r$, so is the function $\omega$. Conversely, if $\omega$ is of class $C^r$ as a function
of $(\mathbf{x}, \mathbf{v}_1, \ldots, \mathbf{v}_k)$, then in particular, given an ascending $k$-tuple
$J = (j_1, \ldots, j_k)$ from the set $\{1, \ldots, n\}$, the function

\[ \omega(\mathbf{x})((\mathbf{x}; \mathbf{e}_{j_1}), \ldots, (\mathbf{x}; \mathbf{e}_{j_k})) \]

is of class $C^r$ as a function of $\mathbf{x}$. But this function equals $b_J(\mathbf{x})$. □

Lemma 29.3. Let $\omega$ and $\eta$ be $k$-forms, and let $\theta$ be an $\ell$-form, on
the open set $A$ of $\mathbf{R}^n$. If $\omega$ and $\eta$ and $\theta$ are of class $C^r$, so are $a\omega + b\eta$
and $\omega \wedge \theta$.

Proof. It is immediate that $a\omega + b\eta$ is of class $C^r$, since it is a linear
combination of $C^r$ functions. To show that $\omega \wedge \theta$ is of class $C^r$, one could
use the formula for the wedge product given in the proof of Theorem 28.1.
Alternatively, one can use the preceding lemma: Let us write

\[ \omega = \sum_{[I]} b_I\, \psi_I \quad \text{and} \quad \theta = \sum_{[J]} c_J\, \psi_J, \]

where $I$ and $J$ are ascending $k$- and $\ell$-tuples, respectively, from the set
$\{1, \ldots, n\}$. Then

\[ \omega \wedge \theta = \sum_{[I]} \sum_{[J]} b_I c_J\, \psi_I \wedge \psi_J. \]

To write $(\omega \wedge \theta)(\mathbf{x})$ in terms of elementary alternating tensors, we drop all
terms with repeated indices, rewrite the remaining terms with indices in ascending
order, and collect like terms. We see thus that each component of
$\omega \wedge \theta$ is the sum (with signs $\pm 1$) of functions of the form $b_I c_J$. Thus the
component functions of $\omega \wedge \theta$ are of class $C^r$. □

Differential forms of order zero 

In what follows, we shall need to deal not only with tensor fields in $\mathbf{R}^n$, but
with scalar fields as well. It is convenient to treat scalar fields as differential
forms of order 0.

Definition. If $A$ is open in $\mathbf{R}^n$, and if $f : A \to \mathbf{R}$ is a map of class $C^r$,
then $f$ is called a scalar field in $A$. We also call $f$ a differential form of
order 0.

The sum of two such functions is another such, and so is the product by
a scalar. We define the wedge product of two 0-forms $f$ and $g$ by the rule
$f \wedge g = f \cdot g$, which is just the usual product of real-valued functions. More
generally, we define the wedge product of the 0-form $f$ and the $k$-form $\omega$ by
the rule

\[ (\omega \wedge f)(\mathbf{x}) = (f \wedge \omega)(\mathbf{x}) = f(\mathbf{x}) \cdot \omega(\mathbf{x}); \]

this is just the usual product of the tensor $\omega(\mathbf{x})$ and the scalar $f(\mathbf{x})$.

Note that all the formal algebraic properties of the wedge product hold.
Associativity, homogeneity, and distributivity are immediate; and anticommutativity
holds because scalar fields are forms of order 0:

\[ f \wedge g = (-1)^0\, g \wedge f \quad \text{and} \quad f \wedge \omega = (-1)^0\, \omega \wedge f. \]

Convention. Henceforth, we shall use Roman letters such as $f$, $g$, $h$
to denote 0-forms, and Greek letters such as $\omega$, $\eta$, $\theta$ to denote $k$-forms
for $k > 0$.

EXERCISES 

1. Let $\gamma : \mathbf{R} \to \mathbf{R}^n$ be of class $C^r$. Show that the velocity vector of $\gamma$
corresponding to the parameter value $t$ is the vector $\gamma_*(t; \mathbf{e}_1)$.

2. If $A$ is open in $\mathbf{R}^k$ and $\alpha : A \to \mathbf{R}^n$ is of class $C^r$, show that $\alpha_*(\mathbf{x}; \mathbf{v})$ is the
velocity vector of the curve $\gamma(t) = \alpha(\mathbf{x} + t\mathbf{v})$ corresponding to parameter
value $t = 0$.

3. Let $M$ be a $k$-manifold of class $C^r$ in $\mathbf{R}^n$. Let $\mathbf{p} \in M$. Show that the
tangent space to $M$ at $\mathbf{p}$ is well-defined, independent of the choice of the
coordinate patch.

4. Let $M$ be a $k$-manifold in $\mathbf{R}^n$ of class $C^r$. Let $\mathbf{p} \in M - \partial M$.

(a) Show that if $(\mathbf{p}; \mathbf{v})$ is a tangent vector to $M$, then there is a
parametrized-curve $\gamma : (-\epsilon, \epsilon) \to \mathbf{R}^n$ whose image set lies in $M$, such
that $(\mathbf{p}; \mathbf{v})$ equals the velocity vector of $\gamma$ corresponding to parameter
value $t = 0$. See Figure 29.4.

(b) Prove the converse. [Hint: Recall that for any coordinate patch $\alpha$,
the map $\alpha^{-1}$ is of class $C^r$. See Theorem 24.1.]

5. Let $M$ be a $k$-manifold in $\mathbf{R}^n$ of class $C^r$. Let $\mathbf{q} \in \partial M$.

(a) Show that if $(\mathbf{q}; \mathbf{v})$ is a tangent vector to $M$ at $\mathbf{q}$, then there is a
parametrized-curve $\gamma : (-\epsilon, \epsilon) \to \mathbf{R}^n$, where $\gamma$ carries either $(-\epsilon, 0]$
or $[0, \epsilon)$ into $M$, such that $(\mathbf{q}; \mathbf{v})$ equals the velocity vector of $\gamma$
corresponding to parameter value $t = 0$.

(b) Prove the converse.

Figure 29.4


§30. THE DIFFERENTIAL OPERATOR 


We now introduce a certain operator $d$ on differential forms. In general, the
operator $d$, when applied to a $k$-form, gives a $k + 1$ form. We begin by defining
$d$ for 0-forms.

The differential of a 0-form

A 0-form on an open set $A$ of $\mathbf{R}^n$ is a function $f : A \to \mathbf{R}$. The differential
$df$ of $f$ is to be a 1-form on $A$, that is, a linear transformation of $T_{\mathbf{x}}(\mathbf{R}^n)$ into
$\mathbf{R}$, for each $\mathbf{x} \in A$. We studied such a linear transformation in Chapter 2. We
called it the “derivative of $f$ at $\mathbf{x}$ with respect to the vector $\mathbf{v}$.” We now look
at this notion as defining a 1-form on $A$.

Definition. Let $A$ be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be a function of
class $C^r$. We define a 1-form $df$ on $A$ by the formula

\[ df(\mathbf{x})(\mathbf{x}; \mathbf{v}) = f'(\mathbf{x}; \mathbf{v}) = Df(\mathbf{x}) \cdot \mathbf{v}. \]

The 1-form $df$ is called the differential of $f$. It is of class $C^{r-1}$ as a function
of $\mathbf{x}$ and $\mathbf{v}$.

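Theorem 30.1. The operator $d$ is linear on 0-forms.

Proof. Let $f, g : A \to \mathbf{R}$ be of class $C^r$. Let $h = af + bg$. Then

\[ Dh(\mathbf{x}) = a\, Df(\mathbf{x}) + b\, Dg(\mathbf{x}), \]

so that

\[ dh(\mathbf{x})(\mathbf{x}; \mathbf{v}) = a\, df(\mathbf{x})(\mathbf{x}; \mathbf{v}) + b\, dg(\mathbf{x})(\mathbf{x}; \mathbf{v}). \]

Thus $dh = a(df) + b(dg)$, as desired. □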

Using the operator $d$, we can obtain a new way of expressing the elementary
1-forms $\phi_i$ in $\mathbf{R}^n$:

Lemma 30.2. Let $\phi_1, \ldots, \phi_n$ be the elementary 1-forms in $\mathbf{R}^n$.
Let $\pi_i : \mathbf{R}^n \to \mathbf{R}$ be the $i$th projection function, defined by the equation

\[ \pi_i(x_1, \ldots, x_n) = x_i. \]

Then $d\pi_i = \phi_i$.

Proof. Since $\pi_i$ is a $C^\infty$ function, $d\pi_i$ is a 1-form of class $C^\infty$. We
compute

\begin{align*}
d\pi_i(\mathbf{x})(\mathbf{x}; \mathbf{v}) &= D\pi_i(\mathbf{x}) \cdot \mathbf{v} \\
&= \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}
\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \\
&= v_i.
\end{align*}

Thus $d\pi_i = \phi_i$. □

Now it is common in this subject to abuse notation slightly, denoting the
$i$th projection function not by $\pi_i$ but by $x_i$. Then in this notation, $\phi_i$ is equal
to $dx_i$. We shall use this notation henceforth:

Convention. If $\mathbf{x}$ denotes the general point of $\mathbf{R}^n$, we denote the $i$th
projection function mapping $\mathbf{R}^n$ to $\mathbf{R}$ by the symbol $x_i$. Then $dx_i$ equals
the elementary 1-form $\phi_i$ in $\mathbf{R}^n$. If $I = (i_1, \ldots, i_k)$ is an ascending
$k$-tuple from the set $\{1, \ldots, n\}$, then we introduce the notation

\[ dx_I = dx_{i_1} \wedge \cdots \wedge dx_{i_k} \]

for the elementary $k$-form $\psi_I$ in $\mathbf{R}^n$. The general $k$-form can then be
written uniquely in the form

\[ \omega = \sum_{[I]} b_I\, dx_I, \]

for some scalar functions $b_I$.

The forms $dx_i$ and $dx_I$ are of course characterized by the equations

\[ dx_i(\mathbf{x})(\mathbf{x}; \mathbf{v}) = v_i, \]

\[ dx_I(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_k)) = \det X_I, \]

where $X$ is the matrix $X = [\mathbf{v}_1 \cdots \mathbf{v}_k]$.

For convenience, we extend this notation to an arbitrary $k$-tuple
$J = (j_1, \ldots, j_k)$ from the set $\{1, \ldots, n\}$, setting

\[ dx_J = dx_{j_1} \wedge \cdots \wedge dx_{j_k}. \]

Note that whereas $dx_j$ is the differential of a 0-form, $dx_J$ does not denote the
differential of a form, but rather a wedge product of elementary 1-forms.
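As a small illustration of the extended notation (the particular tuples are chosen here only as examples), consider $\mathbf{R}^3$. Anticommutativity of the wedge product relates $dx_J$ for a non-ascending tuple $J$ to the elementary form with the same indices in ascending order:

```latex
\begin{align*}
dx_{(3,1)}   &= dx_3 \wedge dx_1 = -\,dx_1 \wedge dx_3 = -\,dx_{(1,3)}, \\
dx_{(2,3,1)} &= dx_2 \wedge dx_3 \wedge dx_1 = dx_1 \wedge dx_2 \wedge dx_3
                \quad \text{(two exchanges, so the sign is } +1\text{)}, \\
dx_{(2,1,2)} &= dx_2 \wedge dx_1 \wedge dx_2 = 0
                \quad \text{(a repeated index)}.
\end{align*}
```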

REMARK. Why do we call the use of $x_i$ for $\pi_i$ an abuse of notation? The
reason is this: Normally, we use a single letter such as $f$ to denote a function,
and we use the symbol $f(x)$ to denote the value of the function at the point $x$.
That is, $f$ stands for the rule defining the function, and $f(x)$ denotes an
element of the range of $f$. It is an abuse of notation to confuse the function
with the value of the function.

However, this abuse is fairly common. We often speak of “the function
$x^3 + 2x + 1$” when we should instead speak of “the function $f$ defined by the
equation $f(x) = x^3 + 2x + 1$,” and we speak of “the function $e^x$” when we
should speak of “the exponential function.”

We are doing the same thing here. The value of the $i$th projection function
at the point $\mathbf{x}$ is the number $x_i$; we abuse notation when we use $x_i$ to denote
the function itself. This usage is standard, however, and we shall conform
to it.

If $f$ is a 0-form, then $df$ is a 1-form, so it can be expressed as a linear
combination of elementary 1-forms. The expression is a familiar one:

Theorem 30.3. Let $A$ be open in $\mathbf{R}^n$; let $f : A \to \mathbf{R}$ be of class $C^r$.
Then

\[ df = (D_1 f)\, dx_1 + \cdots + (D_n f)\, dx_n. \]

In particular, $df = 0$ if $f$ is a constant function.

In Leibnitz’s notation, this equation takes the form

\[ df = \frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n. \]

This formula sometimes appears in calculus books, but its meaning is not
explained there.

Proof. We evaluate both sides of the equation on the tangent vector
$(\mathbf{x}; \mathbf{v})$. We have

\[ df(\mathbf{x})(\mathbf{x}; \mathbf{v}) = Df(\mathbf{x}) \cdot \mathbf{v} \]

by definition, whereas

\[ \sum_{i=1}^{n} D_i f(\mathbf{x})\, dx_i(\mathbf{x})(\mathbf{x}; \mathbf{v}) = \sum_{i=1}^{n} D_i f(\mathbf{x})\, v_i. \]

The theorem follows. □
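A short worked instance may help fix the formula. The function, point, and vector below are chosen here purely for illustration. Take $f(\mathbf{x}) = x_1^2 x_2$ on $\mathbf{R}^2$, the point $\mathbf{x} = (1, 3)$, and the vector $\mathbf{v} = (2, -1)$:

```latex
\begin{align*}
D_1 f &= 2 x_1 x_2, \qquad D_2 f = x_1^2, \\
df &= 2 x_1 x_2 \, dx_1 + x_1^2 \, dx_2, \\
df(\mathbf{x})(\mathbf{x}; \mathbf{v})
   &= (2 \cdot 1 \cdot 3)(2) + (1^2)(-1) = 12 - 1 = 11, \\
Df(\mathbf{x}) \cdot \mathbf{v}
   &= \begin{bmatrix} 6 & 1 \end{bmatrix}
      \begin{bmatrix} 2 \\ -1 \end{bmatrix} = 11.
\end{align*}
```

Both evaluations give the same number, as the theorem asserts.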

The fact that $df$ is only of class $C^{r-1}$ if $f$ is of class $C^r$ is very inconvenient.
It means that we must keep track of how many degrees of differentiability
are needed in any given argument. In order to avoid these difficulties,
we make the following convention:

Convention. Henceforth, we restrict ourselves to manifolds, maps,
vector fields, and forms that are of class $C^\infty$.

The differential of a k-form 

We now define the differential operator d in general. It is in some sense 
a generalized directional derivative. A formula that makes this fact explicit 
appears in the exercises. Rather than using this formula to define d , we shall 
instead characterize d by its formal properties, as given in the theorem that 
follows. 


Definition. If $A$ is an open set in $\mathbf{R}^n$, let $\Omega^k(A)$ denote the set of all
$k$-forms on $A$ (of class $C^\infty$). The sum of two such $k$-forms is another $k$-form,
and so is the product of a $k$-form by a scalar. It is easy to see that $\Omega^k(A)$
satisfies the axioms for a vector space; we call it the linear space of $k$-forms
on $A$.

Theorem 30.4. Let $A$ be an open set in $\mathbf{R}^n$. There exists a unique
linear transformation

\[ d : \Omega^k(A) \to \Omega^{k+1}(A), \]

defined for $k \geq 0$, such that:

(1) If $f$ is a 0-form, then $df$ is the 1-form

\[ df(\mathbf{x})(\mathbf{x}; \mathbf{v}) = Df(\mathbf{x}) \cdot \mathbf{v}. \]

(2) If $\omega$ and $\eta$ are forms of orders $k$ and $\ell$, respectively, then

\[ d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^k\, \omega \wedge d\eta. \]

(3) For every form $\omega$,

\[ d(d\omega) = 0. \]

We call $d$ the differential operator, and we call $d\omega$ the differential
of $\omega$.

Proof. Step 1. We verify uniqueness. First, we show that conditions
(2) and (3) imply that for any forms $\omega_1, \ldots, \omega_k$, we have

\[ d(d\omega_1 \wedge \cdots \wedge d\omega_k) = 0. \]

If $k = 1$, this equation is a consequence of (3). Supposing it true for $k - 1$,
we set $\eta = (d\omega_2 \wedge \cdots \wedge d\omega_k)$ and use (2) to compute

\[ d(d\omega_1 \wedge \eta) = d(d\omega_1) \wedge \eta \pm d\omega_1 \wedge d\eta. \]

The first term vanishes by (3) and the second vanishes by the induction
hypothesis.

Now we show that for any $k$-form $\omega$, the form $d\omega$ is entirely determined
by the value of $d$ on 0-forms, which is specified by (1). Since $d$ is linear, it
suffices to consider the case $\omega = f\, dx_I$. We compute

\begin{align*}
d\omega = d(f\, dx_I) &= d(f \wedge dx_I) \\
&= df \wedge dx_I + f \wedge d(dx_I) \quad \text{by (2),} \\
&= df \wedge dx_I,
\end{align*}

by the result just proved. Thus $d\omega$ is determined by the value of $d$ on the
0-form $f$.

Step 2. We now define $d$. Its value for 0-forms is specified by (1). The
computation just made tells us how to define it for forms of positive order:
If $A$ is an open set in $\mathbf{R}^n$ and if $\omega$ is a $k$-form on $A$, we write $\omega$ uniquely in
the form

\[ \omega = \sum_{[I]} f_I\, dx_I, \]

and define

\[ d\omega = \sum_{[I]} df_I \wedge dx_I. \]

We check that $d\omega$ is of class $C^\infty$. For this purpose, we first compute

\[ d\omega = \sum_{[I]} \Big[ \sum_{j=1}^{n} (D_j f_I)\, dx_j \Big] \wedge dx_I. \]

To express $d\omega$ as a linear combination of elementary $k + 1$ forms, one proceeds
as follows: First, delete all terms for which $j$ is the same as one of the indices
in the $k$-tuple $I$. Second, take the remaining terms and rearrange the $dx_i$ so
the indices are in ascending order. Third, collect like terms. One sees in this
way that each component of $d\omega$ is a linear combination of the functions $D_j f_I$,
so that it is of class $C^\infty$. Thus $d\omega$ is of class $C^\infty$. (Note that if $\omega$ were only
of class $C^r$, then $d\omega$ would be of class $C^{r-1}$.)
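To see the three-step recipe in action, here is a worked instance; the form is chosen here only for illustration. In $\mathbf{R}^3$, let $\omega = x_1^2 x_2\, dx_1 + x_3\, dx_2$. Applying the definition $d\omega = \sum df_I \wedge dx_I$ and then deleting, reordering, and collecting terms gives:

```latex
\begin{align*}
d\omega &= d(x_1^2 x_2) \wedge dx_1 + d(x_3) \wedge dx_2 \\
        &= (2 x_1 x_2 \, dx_1 + x_1^2 \, dx_2) \wedge dx_1 + dx_3 \wedge dx_2 \\
        &= x_1^2 \, dx_2 \wedge dx_1 + dx_3 \wedge dx_2
           && \text{(the term with } dx_1 \wedge dx_1 \text{ is deleted)} \\
        &= -\,x_1^2 \, dx_1 \wedge dx_2 \;-\; dx_2 \wedge dx_3
           && \text{(indices put in ascending order)}.
\end{align*}
```

Each component of $d\omega$ is, up to sign, one of the partial derivatives $D_j f_I$, as claimed.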

We show $d$ is linear on $k$-forms with $k > 0$. Let

\[ \omega = \sum_{[I]} f_I\, dx_I \quad \text{and} \quad \eta = \sum_{[I]} g_I\, dx_I \]

be $k$-forms. Then

\begin{align*}
d(a\omega + b\eta) &= d \sum_{[I]} (a f_I + b g_I)\, dx_I \\
&= \sum_{[I]} d(a f_I + b g_I) \wedge dx_I \quad \text{by definition,} \\
&= \sum_{[I]} (a\, df_I + b\, dg_I) \wedge dx_I \quad \text{since $d$ is linear on 0-forms,} \\
&= a\, d\omega + b\, d\eta.
\end{align*}

Step 3. We now show that if $J$ is an arbitrary $k$-tuple of integers from
the set $\{1, \ldots, n\}$, then

\[ d(f \wedge dx_J) = df \wedge dx_J. \]

Certainly this formula holds if two of the indices in $J$ are the same, since
$dx_J = 0$ in this case. So suppose the indices in $J$ are distinct. Let $I$ be
the $k$-tuple obtained by rearranging the indices in $J$ in ascending order; let
$\pi$ be the permutation involved. Anticommutativity of the wedge product
implies that $dx_I = (\operatorname{sgn} \pi)\, dx_J$. Because $d$ is linear and the wedge product is
homogeneous, the formula $d(f \wedge dx_I) = df \wedge dx_I$, which holds by definition,
implies that

\[ (\operatorname{sgn} \pi)\, d(f \wedge dx_J) = (\operatorname{sgn} \pi)\, df \wedge dx_J. \]

Our desired result follows.

Step 4. We verify property (2), in the case $k = 0$ and $\ell = 0$. We
compute

\begin{align*}
d(f \wedge g) &= \sum_{j=1}^{n} D_j(f \cdot g)\, dx_j \\
&= \sum_{j=1}^{n} (D_j f) \cdot g\, dx_j + \sum_{j=1}^{n} f \cdot (D_j g)\, dx_j \\
&= (df) \wedge g + f \wedge (dg).
\end{align*}

Step 5. We verify property (2) in general. First, we consider the case
where both forms have positive order. Since both sides of our desired equation
are linear in $\omega$ and in $\eta$, it suffices to consider the case

\[ \omega = f\, dx_I \quad \text{and} \quad \eta = g\, dx_J. \]

We compute

\begin{align*}
d(\omega \wedge \eta) &= d(fg\, dx_I \wedge dx_J) \\
&= d(fg) \wedge dx_I \wedge dx_J \quad \text{by Step 3,} \\
&= (df \wedge g + f \wedge dg) \wedge dx_I \wedge dx_J \quad \text{by Step 4,} \\
&= (df \wedge dx_I) \wedge (g \wedge dx_J) + (-1)^k (f \wedge dx_I) \wedge (dg \wedge dx_J) \\
&= d\omega \wedge \eta + (-1)^k\, \omega \wedge d\eta.
\end{align*}

The sign $(-1)^k$ comes from the fact that $dx_I$ is a $k$-form and $dg$ is a 1-form.

Finally, the proof in the case where one of $k$ or $\ell$ is zero proceeds as in the
argument just given. If $k = 0$, the term $dx_I$ is missing from the equations,
while if $\ell = 0$, the term $dx_J$ is missing. We leave the details to you.

Step 6. We show that if $f$ is a 0-form, then $d(df) = 0$. We have

\begin{align*}
d(df) &= d \sum_{j=1}^{n} D_j f\, dx_j \\
&= \sum_{j=1}^{n} d(D_j f) \wedge dx_j \quad \text{by definition,} \\
&= \sum_{j=1}^{n} \sum_{i=1}^{n} D_i D_j f\, dx_i \wedge dx_j.
\end{align*}

To write this expression in standard form, we delete all terms for which $i = j$,
and collect the remaining terms as follows:

\[ d(df) = \sum_{i<j} (D_i D_j f - D_j D_i f)\, dx_i \wedge dx_j. \]

The equality of the mixed partial derivatives implies that $d(df) = 0$.

Step 7. We show that if $\omega$ is a $k$-form with $k > 0$, then $d(d\omega) = 0$.
Since $d$ is linear, it suffices to consider the case $\omega = f\, dx_I$. Then

\begin{align*}
d(d\omega) &= d(df \wedge dx_I) \\
&= d(df) \wedge dx_I - df \wedge d(dx_I),
\end{align*}

by property (2). Now $d(df) = 0$ by Step 6, and

\[ d(dx_I) = d(1) \wedge dx_I = 0 \]

by definition. Hence $d(d\omega) = 0$. □
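The definition in Step 2 is concrete enough to be carried out mechanically. The following is a minimal sketch under stated assumptions, not an implementation from the text: it handles only forms with polynomial coefficients, it uses 0-based variable indices, and all names (`poly_add`, `partial`, `sort_sign`, `d`) are hypothetical. A polynomial is stored as a dictionary from exponent tuples to coefficients, and a form as a dictionary from ascending index tuples to polynomials.

```python
def poly_add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
        if out[mono] == 0:
            del out[mono]          # keep only nonzero terms
    return out

def poly_scale(p, s):
    """Product of a polynomial by the nonzero scalar s."""
    return {mono: s * c for mono, c in p.items()}

def partial(p, j):
    """Partial derivative D_j of a polynomial (j is a 0-based variable index)."""
    out = {}
    for mono, c in p.items():
        if mono[j] > 0:
            lowered = mono[:j] + (mono[j] - 1,) + mono[j + 1:]
            out = poly_add(out, {lowered: c * mono[j]})
    return out

def sort_sign(indices):
    """Return (sign, ascending tuple), or (0, None) if an index repeats."""
    if len(set(indices)) < len(indices):
        return 0, None
    inversions = sum(1 for a in range(len(indices))
                     for b in range(a + 1, len(indices))
                     if indices[a] > indices[b])
    return (-1 if inversions % 2 else 1), tuple(sorted(indices))

def d(form, n):
    """d(sum_I f_I dx_I) = sum_I sum_j (D_j f_I) dx_j ^ dx_I, in standard form."""
    out = {}
    for I, f in form.items():
        for j in range(n):
            sign, K = sort_sign((j,) + I)
            if sign == 0:
                continue           # dx_j ^ dx_I = 0 when j already occurs in I
            term = poly_scale(partial(f, j), sign)
            merged = poly_add(out.get(K, {}), term)
            if merged:
                out[K] = merged
            else:
                out.pop(K, None)   # drop components that cancel to zero
    return out

n = 3
# 0-form f = x0^2 x1 x2 + x1^3, stored under the empty index tuple.
f = {(): {(2, 1, 1): 1, (0, 3, 0): 1}}
# 1-form omega = x0^2 x1 dx0 + x2 dx1.
omega = {(0,): {(2, 1, 0): 1}, (1,): {(0, 0, 1): 1}}

df = d(f, n)
d_omega = d(omega, n)      # equals -x0^2 dx0^dx1 - dx1^dx2
print(d_omega)
print(d(df, n), d(d_omega, n))
```

In this representation the empty dictionary stands for the zero form, so the last line illustrates property (3) on two examples: the mixed partial derivatives cancel in pairs, exactly as in Step 6.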


Definition. Let $A$ be an open set in $\mathbf{R}^n$. A 0-form $f$ on $A$ is said to be
exact on $A$ if it is constant on $A$; a $k$-form $\omega$ on $A$ with $k > 0$ is said to be
exact on $A$ if there is a $k - 1$ form $\theta$ on $A$ such that $\omega = d\theta$. A $k$-form $\omega$ on
$A$ with $k \geq 0$ is said to be closed if $d\omega = 0$.

Every exact form is closed; for if $f$ is constant, then $df = 0$, while if
$\omega = d\theta$, then $d\omega = d(d\theta) = 0$. Conversely, every closed form on $A$ is exact
on $A$ if $A$ equals all of $\mathbf{R}^n$, or more generally, if $A$ is a “star-convex” subset
of $\mathbf{R}^n$. (See Chapter 8.) But the converse does not hold in general, as we
shall see. If every closed $k$-form on $A$ is exact on $A$, then we say that $A$ is
homologically trivial in dimension $k$. We shall explore this notion further
in Chapter 8.





EXAMPLE 1. Let $A$ be the open set in $\mathbf{R}^2$ consisting of all points $(x, y)$ for
which $x \neq 0$. Set $f(x, y) = x/|x|$ for $(x, y) \in A$. Then $f$ is of class $C^\infty$ on $A$,
and $df = 0$ on $A$. But $f$ is not exact on $A$ because $f$ is not constant on $A$.

EXAMPLE 2. Exactness is a notion you have seen before. In differential equations,
for example, the equation

\[ P(x, y)\, dx + Q(x, y)\, dy = 0 \]

is said to be exact if there is a function $f$ such that $P = \partial f/\partial x$ and
$Q = \partial f/\partial y$. In our terminology, this means simply that the 1-form $P\, dx + Q\, dy$
is the differential of the 0-form $f$, so that it is exact.

Exactness is also related to the notion of conservative vector fields. In
$\mathbf{R}^3$, for example, the vector field

\[ F = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k} \]

is said to be conservative if it is the gradient of a scalar field $f$, that is, if

\[ P = \partial f/\partial x \quad \text{and} \quad Q = \partial f/\partial y \quad \text{and} \quad R = \partial f/\partial z. \]

This is precisely the same as saying that the form $P\, dx + Q\, dy + R\, dz$ is the
differential of the 0-form $f$.
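For a concrete instance of this correspondence (the scalar field is chosen here only for illustration), take $f(x, y, z) = xy + z^2$ on $\mathbf{R}^3$:

```latex
\begin{align*}
\operatorname{grad} f &= y\,\mathbf{i} + x\,\mathbf{j} + 2z\,\mathbf{k}, \\
df &= y\,dx + x\,dy + 2z\,dz, \\
d(df) &= dy \wedge dx + dx \wedge dy + 2\,dz \wedge dz = 0.
\end{align*}
```

The field $y\,\mathbf{i} + x\,\mathbf{j} + 2z\,\mathbf{k}$ is conservative, the form $y\,dx + x\,dy + 2z\,dz$ is exact, and the last line confirms that this exact form is closed.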

We shall explore further the connection between forms and vector fields 
in the next section. 


EXERCISES 

1. Let $A$ be open in $\mathbf{R}^n$.

(a) Show that $\Omega^k(A)$ is a vector space.

(b) Show that the set of all $C^\infty$ vector fields on $A$ is a vector space.

2. Consider the forms

\[ \omega = xy\, dx + 3\, dy - yz\, dz, \]

\[ \eta = x\, dx - yz^2\, dy + 2x\, dz, \]

in $\mathbf{R}^3$. Verify by direct computation that

\[ d(d\omega) = 0 \quad \text{and} \quad d(\omega \wedge \eta) = (d\omega) \wedge \eta - \omega \wedge d\eta. \]

3. Let $\omega$ be a $k$-form defined in an open set $A$ of $\mathbf{R}^n$. We say that $\omega$ vanishes
at $\mathbf{x}$ if $\omega(\mathbf{x})$ is the 0-tensor.

(a) Show that if $\omega$ vanishes at each $\mathbf{x}$ in a neighborhood of $\mathbf{x}_0$, then $d\omega$
vanishes at $\mathbf{x}_0$.

(b) Give an example to show that if $\omega$ vanishes at $\mathbf{x}_0$, then $d\omega$ need not
vanish at $\mathbf{x}_0$.

4. Let $A = \mathbf{R}^2 - \mathbf{0}$; consider the 1-form in $A$ defined by the equation

\[ \omega = (x\, dx + y\, dy)/(x^2 + y^2). \]

(a) Show $\omega$ is closed.

(b) Show that $\omega$ is exact on $A$.

*5. Prove the following:

Theorem. Let $A = \mathbf{R}^2 - \mathbf{0}$; let

\[ \omega = (-y\, dx + x\, dy)/(x^2 + y^2) \]

in $A$. Then $\omega$ is closed, but not exact, in $A$.

Proof. (a) Show $\omega$ is closed.

(b) Let $B$ consist of $\mathbf{R}^2$ with the non-negative $x$-axis deleted. Show that
for each $(x, y) \in B$, there is a unique $t$ with $0 < t < 2\pi$ such that

\[ x = (x^2 + y^2)^{1/2} \cos t \quad \text{and} \quad y = (x^2 + y^2)^{1/2} \sin t; \]

denote this value of $t$ by $\phi(x, y)$.

(c) Show that $\phi$ is of class $C^\infty$. [Hint: The inverse sine and inverse
cosine functions are of class $C^\infty$ on the intervals $(-\pi/2, \pi/2)$ and
$(0, \pi)$, respectively.]

(d) Show that $\omega = d\phi$ in $B$. [Hint: We have $\tan \phi = y/x$ if $x \neq 0$ and
$\cot \phi = x/y$ if $y \neq 0$.]

(e) Show that if $g$ is a closed 0-form in $B$, then $g$ is constant in $B$. [Hint:
Use the mean-value theorem to show that if $\mathbf{a}$ is the point $(-1, 0)$ of
$\mathbf{R}^2$, then $g(\mathbf{x}) = g(\mathbf{a})$ for all $\mathbf{x} \in B$.]

(f) Show that $\omega$ is not exact in $A$. [Hint: If $\omega = df$ in $A$, then $f - \phi$
is constant in $B$. Evaluate the limit of $f(1, y)$ as $y$ approaches 0
through positive and negative values.]

6. Let $A = \mathbf{R}^n - \mathbf{0}$. Let $m$ be a fixed positive integer. Consider the following
$n - 1$ form in $A$:

\[ \eta = \sum_{i=1}^{n} (-1)^{i-1} f_i\, dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n, \]

where $f_i(\mathbf{x}) = x_i / \|\mathbf{x}\|^m$, and where $\widehat{dx_i}$ means that the factor $dx_i$ is to
be omitted.

(a) Calculate $d\eta$.

(b) For what values of $m$ is it true that $d\eta = 0$? (We show later that $\eta$
is not exact.)

*7. Prove the following, which expresses $d$ as a generalized “directional
derivative”:

Theorem. Let $A$ be open in $\mathbf{R}^n$; let $\omega$ be a $k - 1$ form in $A$. Given
$\mathbf{v}_1, \ldots, \mathbf{v}_k \in \mathbf{R}^n$, define

\[ h(\mathbf{x}) = d\omega(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_k)), \]

\[ g_j(\mathbf{x}) = \omega(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, \widehat{(\mathbf{x}; \mathbf{v}_j)}, \ldots, (\mathbf{x}; \mathbf{v}_k)), \]

where $\widehat{a}$ means that the component $a$ is to be omitted. Then

\[ h(\mathbf{x}) = \sum_{j=1}^{k} (-1)^{j-1} Dg_j(\mathbf{x}) \cdot \mathbf{v}_j. \]

Proof. (a) Let $X = [\mathbf{v}_1 \cdots \mathbf{v}_k]$. For each $j$, let $Y_j = [\mathbf{v}_1 \cdots \widehat{\mathbf{v}_j} \cdots \mathbf{v}_k]$.
Given $(i, i_1, \ldots, i_{k-1})$, show that

\[ \det X(i, i_1, \ldots, i_{k-1}) = \sum_{j=1}^{k} (-1)^{j-1} v_{ij} \det Y_j(i_1, \ldots, i_{k-1}). \]

(b) Verify the theorem in the case $\omega = f\, dx_I$.

(c) Complete the proof.


*§31. APPLICATION TO VECTOR AND SCALAR FIELDS 


Finally, it is time to show that what we have been doing with tensor fields and
forms and the differential operator is a true generalization to $\mathbf{R}^n$ of familiar
facts about vector analysis in $\mathbf{R}^3$. We will use these results in §38, when we
prove the classical versions of Stokes’ theorem and the divergence theorem.

We know that if $A$ is an open set in $\mathbf{R}^n$, then the set $\Omega^k(A)$ of $k$-forms on $A$
is a linear space. It is also easy to check that the set of all $C^\infty$ vector fields
on $A$ is a linear space. We define here a sequence of linear transformations
from scalar fields and vector fields to forms. These transformations act
as operators that “translate” theorems written in the language of scalar and
vector fields to theorems written in the language of forms, and conversely.

We begin by defining the gradient and the divergence operators in $\mathbf{R}^n$.

Definition. Let $A$ be open in $\mathbf{R}^n$. Let $f : A \to \mathbf{R}$ be a scalar field in $A$.
We define a corresponding vector field in $A$, called the gradient of $f$, by the
equation

\[ (\operatorname{grad} f)(\mathbf{x}) = (\mathbf{x}; D_1 f(\mathbf{x})\mathbf{e}_1 + \cdots + D_n f(\mathbf{x})\mathbf{e}_n). \]

If $G(\mathbf{x}) = (\mathbf{x}; g(\mathbf{x}))$ is a vector field in $A$, where $g : A \to \mathbf{R}^n$ is given by the
equation

\[ g(\mathbf{x}) = g_1(\mathbf{x})\mathbf{e}_1 + \cdots + g_n(\mathbf{x})\mathbf{e}_n, \]

then we define a corresponding scalar field in $A$, called the divergence of $G$,
by the equation

\[ (\operatorname{div} G)(\mathbf{x}) = D_1 g_1(\mathbf{x}) + \cdots + D_n g_n(\mathbf{x}). \]

These operators are of course familiar from calculus in the case $n = 3$. The
following theorem shows how these operators correspond to the operator $d$:

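Theorem 31.1. Let $A$ be an open set in $\mathbf{R}^n$. There exist vector
space isomorphisms $\alpha_i$ and $\beta_j$ as in the following diagram:

\[
\begin{array}{ccc}
\text{Scalar fields in } A & \xrightarrow{\;\alpha_0\;} & \Omega^0(A) \\
\big\downarrow \operatorname{grad} & & \big\downarrow d \\
\text{Vector fields in } A & \xrightarrow{\;\alpha_1\;} & \Omega^1(A)
\end{array}
\]

\[
\begin{array}{ccc}
\text{Vector fields in } A & \xrightarrow{\;\beta_{n-1}\;} & \Omega^{n-1}(A) \\
\big\downarrow \operatorname{div} & & \big\downarrow d \\
\text{Scalar fields in } A & \xrightarrow{\;\beta_n\;} & \Omega^n(A)
\end{array}
\]

such that

\[ d \circ \alpha_0 = \alpha_1 \circ \operatorname{grad} \quad \text{and} \quad d \circ \beta_{n-1} = \beta_n \circ \operatorname{div}. \]

Proof. Let $f$ and $h$ be scalar fields in $A$; let

\[ F(\mathbf{x}) = \Big(\mathbf{x}; \sum f_i(\mathbf{x})\mathbf{e}_i\Big) \quad \text{and} \quad G(\mathbf{x}) = \Big(\mathbf{x}; \sum g_i(\mathbf{x})\mathbf{e}_i\Big) \]

be vector fields in $A$. We define the transformations $\alpha_i$ and $\beta_j$ by the
equations

\begin{align*}
\alpha_0 f &= f, \\
\alpha_1 F &= \sum_{i=1}^{n} f_i\, dx_i, \\
\beta_{n-1} G &= \sum_{i=1}^{n} (-1)^{i-1} g_i\, dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n, \\
\beta_n h &= h\, dx_1 \wedge \cdots \wedge dx_n.
\end{align*}

(As usual, the notation $\widehat{a}$ means that the factor $a$ is to be omitted.) The fact
that each $\alpha_i$ and $\beta_j$ is a linear isomorphism, and that the two equations hold,
is left as an exercise. □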
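Before attempting the general verification, it may help to see the second equation hold in one concrete case. The vector field below is chosen here only for illustration; this is a single instance with $n = 3$, not a proof. Let $G(\mathbf{x}) = (\mathbf{x}; x_1^2\,\mathbf{e}_1 + x_1 x_2\,\mathbf{e}_2 + x_3\,\mathbf{e}_3)$, so that $\operatorname{div} G = 2x_1 + x_1 + 1 = 3x_1 + 1$:

```latex
\begin{align*}
\beta_2 G &= x_1^2 \, dx_2 \wedge dx_3 \;-\; x_1 x_2 \, dx_1 \wedge dx_3
             \;+\; x_3 \, dx_1 \wedge dx_2, \\
d(\beta_2 G) &= 2 x_1 \, dx_1 \wedge dx_2 \wedge dx_3
             \;-\; x_1 \, dx_2 \wedge dx_1 \wedge dx_3
             \;+\; dx_3 \wedge dx_1 \wedge dx_2 \\
          &= (2 x_1 + x_1 + 1) \, dx_1 \wedge dx_2 \wedge dx_3 \\
          &= \beta_3(\operatorname{div} G).
\end{align*}
```

The signs $(-1)^{i-1}$ in the definition of $\beta_{n-1}$ are exactly what is needed to cancel the signs that arise when each $dx_i$ is moved back into ascending position.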

This theorem is all that can be said about applications to vector fields in 
general. However, in the case of R 3 , we have a “curl” operator, and something 
more can be said. 

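Definition. Let $A$ be open in $\mathbf{R}^3$; let

\[ F(\mathbf{x}) = \Big(\mathbf{x}; \sum f_i(\mathbf{x})\mathbf{e}_i\Big) \]

be a vector field in $A$. We define another vector field in $A$, called the curl
of $F$, by the equation

\[ (\operatorname{curl} F)(\mathbf{x}) = (\mathbf{x}; (D_2 f_3 - D_3 f_2)\mathbf{e}_1 + (D_3 f_1 - D_1 f_3)\mathbf{e}_2 + (D_1 f_2 - D_2 f_1)\mathbf{e}_3). \]

A convenient trick for remembering the definition of the curl operator is
to think of it as obtained by evaluation of the symbolic determinant

\[ \det \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ D_1 & D_2 & D_3 \\ f_1 & f_2 & f_3 \end{bmatrix}. \]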
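The component formula for the curl can be tried out numerically. The following is a minimal sketch under stated assumptions: the partial derivatives are estimated by central differences rather than computed exactly, indices in the code are 0-based, and the function names and the two sample fields are hypothetical choices made for illustration.

```python
def curl(F, x, h=1e-5):
    """Central-difference estimate of the vector part of curl F at x in R^3.

    F maps a 3-tuple to a 3-tuple (f_1, f_2, f_3).
    """
    def D(i, j):
        # Estimate of the partial derivative of component j in direction i.
        fwd = list(x)
        bwd = list(x)
        fwd[i] += h
        bwd[i] -= h
        return (F(tuple(fwd))[j] - F(tuple(bwd))[j]) / (2.0 * h)

    return (D(1, 2) - D(2, 1),   # D_2 f_3 - D_3 f_2
            D(2, 0) - D(0, 2),   # D_3 f_1 - D_1 f_3
            D(0, 1) - D(1, 0))   # D_1 f_2 - D_2 f_1

def rotation_field(x):
    """Sample field F = -x_2 e_1 + x_1 e_2, whose curl is 2 e_3."""
    return (-x[1], x[0], 0.0)

def gradient_field(x):
    """grad f for f = x_1 x_2 + x_3^2; the curl of a gradient should vanish."""
    return (x[1], x[0], 2.0 * x[2])

point = (0.3, -0.7, 1.1)
curl_rot = curl(rotation_field, point)
curl_grad = curl(gradient_field, point)
print(curl_rot, curl_grad)
```

Up to rounding, the first field has curl $2\mathbf{e}_3$ and the gradient field has curl zero.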

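For $\mathbf{R}^3$, we have the following strengthened version of the preceding
theorem:

Theorem 31.2. Let $A$ be an open set in $\mathbf{R}^3$. There exist vector
space isomorphisms $\alpha_i$ and $\beta_j$ as in the following diagram:

\[
\begin{array}{ccc}
\text{Scalar fields in } A & \xrightarrow{\;\alpha_0\;} & \Omega^0(A) \\
\big\downarrow \operatorname{grad} & & \big\downarrow d \\
\text{Vector fields in } A & \xrightarrow{\;\alpha_1\;} & \Omega^1(A) \\
\big\downarrow \operatorname{curl} & & \big\downarrow d \\
\text{Vector fields in } A & \xrightarrow{\;\beta_2\;} & \Omega^2(A) \\
\big\downarrow \operatorname{div} & & \big\downarrow d \\
\text{Scalar fields in } A & \xrightarrow{\;\beta_3\;} & \Omega^3(A)
\end{array}
\]

such that

\[ d \circ \alpha_0 = \alpha_1 \circ \operatorname{grad} \quad \text{and} \quad d \circ \alpha_1 = \beta_2 \circ \operatorname{curl} \quad \text{and} \quad d \circ \beta_2 = \beta_3 \circ \operatorname{div}. \]

Proof. The maps $\alpha_i$ and $\beta_j$ are those defined in the proof of the preceding
theorem. Only the second equation needs checking; we leave it to
you. □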


EXERCISES 


1. Prove Theorems 31.1 and 31.2.

2. Note that in the case $n = 2$, Theorem 31.1 gives us two maps $\alpha_1$ and $\beta_1$
from vector fields to 1-forms. Compare them.

3. Let $A$ be an open set in $\mathbf{R}^3$.

(a) Translate the equation $d(d\omega) = 0$ into two theorems about vector
and scalar fields in $\mathbf{R}^3$.

(b) Translate the condition that $A$ is homologically trivial in dimension $k$
into a statement about vector and scalar fields in $A$. Consider the
cases $k = 0, 1, 2$.

4. For $\mathbf{R}^4$, there is a way of translating theorems about forms into more
familiar language, if one allows oneself to use matrix fields as well as vector
fields and scalar fields. We outline it here. The complications involved
may help you understand why the language of forms was invented to deal
with $\mathbf{R}^n$ in general.

A square matrix $B$ is said to be skew-symmetric if $B^{\mathrm{tr}} = -B$.
Let $A$ be an open set in $\mathbf{R}^4$. Let $S(A)$ be the set of all $C^\infty$ functions $H$
mapping $A$ into the set of 4 by 4 skew-symmetric matrices. If $h_{ij}(\mathbf{x})$
denotes the entry of $H(\mathbf{x})$ in row $i$ and column $j$, define $\gamma_2 : S(A) \to \Omega^2(A)$
by the equation

\[ \gamma_2(H) = \sum_{i<j} h_{ij}(\mathbf{x})\, dx_i \wedge dx_j. \]

(a) Show $\gamma_2$ is a linear isomorphism.

(b) Let $\alpha_0$, $\alpha_1$, $\beta_3$, $\beta_4$ be defined as in Theorem 31.1. Define operators
“twist” and “spin” as in the following diagram:

\[
\begin{array}{ccc}
\text{Vector fields in } A & \xrightarrow{\;\alpha_1\;} & \Omega^1(A) \\
\big\downarrow \text{twist} & & \big\downarrow d \\
S(A) & \xrightarrow{\;\gamma_2\;} & \Omega^2(A) \\
\big\downarrow \text{spin} & & \big\downarrow d \\
\text{Vector fields in } A & \xrightarrow{\;\beta_3\;} & \Omega^3(A)
\end{array}
\]

such that

\[ d \circ \alpha_1 = \gamma_2 \circ \text{twist} \quad \text{and} \quad d \circ \gamma_2 = \beta_3 \circ \text{spin}. \]

(These operators are facetious analogues in $\mathbf{R}^4$ of the operator “curl”
in $\mathbf{R}^3$.)

5. The operators grad, curl, and div, and the translation operators $\alpha_i$ and
$\beta_j$, seem to depend on the choice of a basis in $\mathbf{R}^n$, since the formulas
defining them involve the components of the vectors involved relative to
the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $\mathbf{R}^n$. However, they in fact depend only on the
inner product in $\mathbf{R}^n$ and the notion of right-handedness, as the following
exercise shows.

Recall that the $k$-volume function $V(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ depends only on
the inner product in $\mathbf{R}^n$. (See the exercises of §21.)

(a) Let $F(\mathbf{x}) = (\mathbf{x}; f(\mathbf{x}))$ be a vector field defined in an open set of $\mathbf{R}^n$.
Show that $\alpha_1 F$ is the unique 1-form such that

\[ \alpha_1 F(\mathbf{x})(\mathbf{x}; \mathbf{v}) = \langle f(\mathbf{x}), \mathbf{v} \rangle. \]

(b) Let $G(\mathbf{x}) = (\mathbf{x}; g(\mathbf{x}))$ be a vector field defined in an open set of $\mathbf{R}^n$.
Show that $\beta_{n-1} G$ is the unique $n - 1$ form such that

\[ \beta_{n-1} G(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_{n-1})) = \epsilon \cdot V(g(\mathbf{x}), \mathbf{v}_1, \ldots, \mathbf{v}_{n-1}), \]

where $\epsilon = +1$ if the frame $(g(\mathbf{x}), \mathbf{v}_1, \ldots, \mathbf{v}_{n-1})$ is right-handed, and
$\epsilon = -1$ otherwise.

(c) Let $h$ be a scalar field defined in an open set of $\mathbf{R}^n$. Show that $\beta_n h$
is the unique $n$-form such that

\[ \beta_n h(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_n)) = \epsilon \cdot h(\mathbf{x}) \cdot V(\mathbf{v}_1, \ldots, \mathbf{v}_n), \]

where $\epsilon = +1$ if $(\mathbf{v}_1, \ldots, \mathbf{v}_n)$ is right-handed, and $\epsilon = -1$ otherwise.

(d) Conclude that the operators grad and div (and curl if $n = 3$) depend
only on the inner product in $\mathbf{R}^n$ and the notion of right-handedness
in $\mathbf{R}^n$. [Hint: The operator $d$ depends only on the vector space
structure of $\mathbf{R}^n$.]


§32. THE ACTION OF A DIFFERENTIABLE MAP


If $\alpha : A \to \mathbf{R}^n$ is a $C^\infty$ map, where $A$ is open in $\mathbf{R}^k$, then $\alpha$ gives rise
to a linear transformation $\alpha_*$ mapping the tangent space to $\mathbf{R}^k$ at $\mathbf{x}$ into
the tangent space to $\mathbf{R}^n$ at $\alpha(\mathbf{x})$. Furthermore, we know that any linear
transformation $T : V \to W$ of vector spaces gives rise to a dual transformation
$T^* : \mathcal{A}^\ell(W) \to \mathcal{A}^\ell(V)$ of alternating tensors. We combine these two facts to
show how a $C^\infty$ map $\alpha$ gives rise to a dual transformation of forms, which
we denote by $\alpha^*$. The transformation $\alpha^*$ preserves all the structure we have
imposed on the space of forms: the vector space structure, the wedge product,
and the differential operator.

Definition. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^\infty$; let $B$
be an open set of $\mathbf{R}^n$ containing $\alpha(A)$. We define a dual transformation of
forms

\[ \alpha^* : \Omega^\ell(B) \to \Omega^\ell(A) \]

as follows: Given a 0-form $f : B \to \mathbf{R}$ on $B$, we define a 0-form $\alpha^* f$ on $A$ by
setting $(\alpha^* f)(\mathbf{x}) = f(\alpha(\mathbf{x}))$ for each $\mathbf{x} \in A$. Then, given an $\ell$-form $\omega$ on $B$
with $\ell > 0$, we define an $\ell$-form $\alpha^*\omega$ on $A$ by the equation

\[ (\alpha^*\omega)(\mathbf{x})((\mathbf{x}; \mathbf{v}_1), \ldots, (\mathbf{x}; \mathbf{v}_\ell)) = \omega(\alpha(\mathbf{x}))(\alpha_*(\mathbf{x}; \mathbf{v}_1), \ldots, \alpha_*(\mathbf{x}; \mathbf{v}_\ell)). \]

Since $f$ and $\omega$ and $\alpha$ and $D\alpha$ are all of class $C^\infty$, so are the forms $\alpha^* f$
and $\alpha^*\omega$. Note that if $f$ and $\omega$ and $\alpha$ were of class $C^r$, then $\alpha^* f$ would be
of class $C^r$ but $\alpha^*\omega$ would only be of class $C^{r-1}$. Here again it is convenient
to have restricted ourselves to $C^\infty$ maps.

Note that if $\alpha$ is a constant map, then $\alpha^* f$ is also constant, and $\alpha^*\omega$ is
the 0-tensor.
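A worked instance of the definition may be useful; the map and the forms are chosen here only for illustration. Let $\alpha : \mathbf{R} \to \mathbf{R}^2$ be the map $\alpha(t) = (\cos t, \sin t)$, let $f(\mathbf{x}) = x_1^2 + x_2^2$, and let $\omega = -x_2\, dx_1 + x_1\, dx_2$ on $B = \mathbf{R}^2$. Then $D\alpha(t)$ is the column matrix with entries $-\sin t$ and $\cos t$, and for a tangent vector $(t; v)$ to $\mathbf{R}$ we compute:

```latex
\begin{align*}
(\alpha^* f)(t) &= f(\cos t, \sin t) = \cos^2 t + \sin^2 t = 1, \\
\alpha_*(t; v) &= \big(\alpha(t);\; (-v \sin t,\; v \cos t)\big), \\
(\alpha^* \omega)(t)(t; v)
   &= \omega(\alpha(t))\big(\alpha_*(t; v)\big) \\
   &= (-\sin t)(-v \sin t) + (\cos t)(v \cos t) = v.
\end{align*}
```

Thus $\alpha^* f$ is the constant function 1, and $\alpha^*\omega$ takes the value $v$ on $(t; v)$, so it is the elementary 1-form $dt$ on $\mathbf{R}$.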

The relation between $\alpha^*$ and the dual of the linear transformation $\alpha_*$ is
the following: Given $\alpha : A \to \mathbf{R}^n$ of class $C^\infty$, with $\alpha(x) = y$, it induces the
linear transformation

$$T = \alpha_* : T_x(\mathbf{R}^k) \to T_y(\mathbf{R}^n);$$

this transformation in turn gives rise to a dual transformation of alternating
tensors,

$$T^* : \mathcal{A}^\ell(T_y(\mathbf{R}^n)) \to \mathcal{A}^\ell(T_x(\mathbf{R}^k)).$$

If $\omega$ is an $\ell$-form on $B$, then $\omega(y)$ is an alternating tensor on $T_y(\mathbf{R}^n)$, so that
$T^*(\omega(y))$ is an alternating tensor on $T_x(\mathbf{R}^k)$. It satisfies the equation

$$T^*(\omega(y)) = (\alpha^*\omega)(x);$$

for

$$T^*(\omega(y))\bigl((x; v_1), \ldots, (x; v_\ell)\bigr) = \omega(\alpha(x))\bigl(\alpha_*(x; v_1), \ldots, \alpha_*(x; v_\ell)\bigr)$$
$$= (\alpha^*\omega)(x)\bigl((x; v_1), \ldots, (x; v_\ell)\bigr).$$

This fact enables us to rewrite earlier results concerning the dual transformation
$T^*$ as results about forms:


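Theorem 32.1. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^m$ be a $C^\infty$
map. Let $B$ be open in $\mathbf{R}^m$ and contain $\alpha(A)$; let $\beta : B \to \mathbf{R}^n$ be a $C^\infty$
map. Let $\omega, \eta, \theta$ be forms defined in an open set $C$ of $\mathbf{R}^n$ containing
$\beta(B)$; assume $\omega$ and $\eta$ have the same order. The transformations $\alpha^*$
and $\beta^*$ have the following properties:

(1) $\beta^*(a\omega + b\eta) = a(\beta^*\omega) + b(\beta^*\eta)$.

(2) $\beta^*(\omega \wedge \theta) = \beta^*\omega \wedge \beta^*\theta$.

(3) $(\beta \circ \alpha)^*\omega = \alpha^*(\beta^*\omega)$.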


Proof. See Figure 32.1. In the case of forms of positive order, proper- 
ties (1) and (3) are merely restatements, in the language of forms, of Theo- 
rem 26.5, and (2) is a restatement of (6) of Theorem 28.1. 

Checking the properties when some or all of the forms have order zero is 
a computation we leave to you. □ 



Figure 32.1 
This theorem shows that $\alpha^*$ preserves the vector space structure and the
wedge product. We now show it preserves the operator $d$. For this purpose
(and later purposes as well), we obtain a formula for computing $\alpha^*\omega$. If $A$ is
open in $\mathbf{R}^k$ and $\alpha : A \to \mathbf{R}^n$, we derive this formula in two cases: when $\omega$ is
a 1-form and when $\omega$ is a $k$-form. This is all we shall need. The general case
is treated in the exercises.

Since $\alpha^*$ is linear and preserves wedge products, and since $\alpha^* f$ equals
$f \circ \alpha$, it remains only to compute $\alpha^*$ for elementary 1-forms and elementary
$k$-forms. Here is the required formula:

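Theorem 32.2. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be a $C^\infty$
map. Let $x$ denote the general point of $\mathbf{R}^k$; let $y$ denote the general
point of $\mathbf{R}^n$. Then $dx_i$ and $dy_i$ denote the elementary 1-forms in $\mathbf{R}^k$
and $\mathbf{R}^n$, respectively.

(a) $\alpha^*(dy_i) = d\alpha_i$.

(b) If $I = (i_1, \ldots, i_k)$ is an ascending $k$-tuple from the set $\{1, \ldots, n\}$,
then

$$\alpha^*(dy_I) = \Bigl(\det \frac{\partial\alpha_I}{\partial x}\Bigr)\, dx_1 \wedge \cdots \wedge dx_k,$$

where

$$\frac{\partial\alpha_I}{\partial x} = \frac{\partial(\alpha_{i_1}, \ldots, \alpha_{i_k})}{\partial(x_1, \ldots, x_k)}.$$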

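Proof. (a) Set $y = \alpha(x)$. We compute the value of $\alpha^*(dy_i)$ on a typical
tangent vector as follows:

$$(\alpha^*(dy_i))(x)(x; v) = dy_i(y)(\alpha_*(x; v))$$
$$= i\text{th component of } (D\alpha(x) \cdot v)$$
$$= \sum_{j=1}^k D_j\alpha_i(x) \cdot v_j$$
$$= \sum_{j=1}^k \frac{\partial\alpha_i}{\partial x_j}(x)\, dx_j(x)(x; v).$$

It follows that

$$\alpha^*(dy_i) = \sum_{j=1}^k \frac{\partial\alpha_i}{\partial x_j}\, dx_j.$$

By Theorem 30.3, the latter expression equals $d\alpha_i$.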
(b) The form $\alpha^*(dy_I)$ is a $k$-form defined in an open set of $\mathbf{R}^k$, so it has
the form

$$\alpha^*(dy_I) = h\, dx_1 \wedge \cdots \wedge dx_k$$

for some scalar function $h$. If we evaluate the right side of this equation on
the $k$-tuple $(x; e_1), \ldots, (x; e_k)$, we obtain the function $h(x)$. The theorem
then follows from the following computation:

$$h(x) = (\alpha^*(dy_I))(x)\bigl((x; e_1), \ldots, (x; e_k)\bigr)$$
$$= dy_I(y)\bigl(\alpha_*(x; e_1), \ldots, \alpha_*(x; e_k)\bigr)$$
$$= dy_I(y)\bigl((y; \partial\alpha/\partial x_1), \ldots, (y; \partial\alpha/\partial x_k)\bigr)$$
$$= \det [D\alpha(x)]_I = \det \frac{\partial\alpha_I}{\partial x}(x).$$

□


It is easy to remember the formula (a); to compute $\alpha^*(dy_i)$, one simply
takes the form $dy_i$ and makes the substitution $y_i = \alpha_i(x)$!

Note that one could compute $\alpha^*(dy_I)$ by the formula

$$\alpha^*(dy_I) = \alpha^*(dy_{i_1}) \wedge \cdots \wedge \alpha^*(dy_{i_k})$$
$$= d\alpha_{i_1} \wedge \cdots \wedge d\alpha_{i_k},$$

but the computation of this wedge product is laborious if $k > 2$.
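The formula in (b) of Theorem 32.2 lends itself to a quick numerical check. The following is a minimal sketch, not part of the text: the helper names `jacobian`, `det`, and `pullback_coefficient`, the sample map, and the use of central differences in place of exact derivatives are all assumptions made for illustration. It computes the coefficient $h(x)$ in $\alpha^*(dy_I) = h\, dx_1 \wedge \cdots \wedge dx_k$ as the determinant of the rows $i_1, \ldots, i_k$ of $D\alpha(x)$.

```python
def jacobian(alpha, x, h=1e-6):
    """Central-difference approximation of D(alpha)(x); row i holds the partials of alpha_i."""
    n, k = len(alpha(x)), len(x)
    J = [[0.0] * k for _ in range(n)]
    for j in range(k):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = alpha(xp), alpha(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def det(M):
    """Determinant by cofactor expansion along the first row (adequate for small k)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for c, entry in enumerate(M[0]):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def pullback_coefficient(alpha, I, x):
    """h(x) with alpha^*(dy_I) = h dx_1 ^ ... ^ dx_k; I is an ascending 1-based k-tuple."""
    J = jacobian(alpha, x)
    return det([J[i - 1] for i in I])


def alpha(x):
    # Hypothetical sample map alpha(u, v) = (u, v, u^2 + v^2) from R^2 to R^3.
    u, v = x
    return [u, v, u * u + v * v]


x0 = [0.5, 0.25]
h12 = pullback_coefficient(alpha, (1, 2), x0)  # det [[1, 0], [0, 1]]   = 1
h13 = pullback_coefficient(alpha, (1, 3), x0)  # det [[1, 0], [2u, 2v]] = 2v
h23 = pullback_coefficient(alpha, (2, 3), x0)  # det [[0, 1], [2u, 2v]] = -2u
```

For this sample map the three coefficients agree with what one gets by expanding $d\alpha_{i_1} \wedge d\alpha_{i_2}$ by hand, which is the alternative route mentioned above.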

Theorem 32.3. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of
class $C^\infty$. If $\omega$ is an $\ell$-form defined in an open set of $\mathbf{R}^n$ containing
$\alpha(A)$, then

$$\alpha^*(d\omega) = d(\alpha^*\omega).$$

Proof. Let $x$ denote the general point of $\mathbf{R}^k$; let $y$ denote the general
point of $\mathbf{R}^n$.

Step 1. We verify the theorem first for a 0-form $f$. We compute the left
side of the equation as follows:

$$(*) \qquad \alpha^*(df) = \alpha^*\Bigl(\sum_{i=1}^n (D_i f)\, dy_i\Bigr) = \sum_{i=1}^n ((D_i f) \circ \alpha)\, d\alpha_i.$$
Then we compute the right side of the equation. We have

$$(**) \qquad d(\alpha^* f) = d(f \circ \alpha) = \sum_{j=1}^k D_j(f \circ \alpha)\, dx_j.$$

We now apply the chain rule. Setting $y = \alpha(x)$, we have

$$D(f \circ \alpha)(x) = Df(y) \cdot D\alpha(x);$$

since $D(f \circ \alpha)$ and $Df$ are row matrices, it follows that

$$D_j(f \circ \alpha)(x) = Df(y) \cdot (j\text{th column of } D\alpha(x)) = \sum_{i=1}^n D_i f(y) \cdot D_j\alpha_i(x).$$

Thus

$$D_j(f \circ \alpha) = \sum_{i=1}^n ((D_i f) \circ \alpha) \cdot D_j\alpha_i.$$

Substituting this result in the equation $(**)$, we have

$$(***) \qquad d(\alpha^* f) = \sum_{j=1}^k \sum_{i=1}^n ((D_i f) \circ \alpha) \cdot D_j\alpha_i\, dx_j = \sum_{i=1}^n ((D_i f) \circ \alpha)\, d\alpha_i.$$

Comparing $(*)$ and $(***)$, we see that $\alpha^*(df) = d(\alpha^* f)$.

Step 2. We prove the theorem for forms of positive order. Since $\alpha^*$ and $d$
are linear, it suffices to treat the case $\omega = f\, dy_I$, where $I = (i_1, \ldots, i_\ell)$ is an
ascending $\ell$-tuple from the set $\{1, \ldots, n\}$. We first compute

$$(\dagger) \qquad \alpha^*(d\omega) = \alpha^*(df \wedge dy_I) = \alpha^*(df) \wedge \alpha^*(dy_I).$$

On the other hand,

$$(\dagger\dagger) \qquad d(\alpha^*\omega) = d[\alpha^*(f \wedge dy_I)]$$
$$= d[(\alpha^* f) \wedge \alpha^*(dy_I)]$$
$$= d(\alpha^* f) \wedge \alpha^*(dy_I) + (\alpha^* f) \wedge 0,$$

since

$$d(\alpha^*(dy_I)) = d(d\alpha_{i_1} \wedge \cdots \wedge d\alpha_{i_\ell}) = 0.$$

The theorem follows by comparing $(\dagger)$ and $(\dagger\dagger)$ and using the result of
Step 1. □
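Step 1 of the proof is the chain rule in disguise, and it can be checked numerically for a particular 0-form. The sketch below is illustrative only: the functions `f` and `alpha` are hypothetical sample choices, and derivatives are approximated by central differences. It compares the $dx_j$-coefficients of $d(\alpha^* f)$ with those of $\alpha^*(df) = \sum_i ((D_i f) \circ \alpha)\, d\alpha_i$.

```python
import math


def f(y):
    # Hypothetical sample 0-form on R^3.
    return y[0] * y[1] + math.sin(y[2])


def alpha(x):
    # Hypothetical sample map from R^2 to R^3.
    u, v = x
    return [u * v, u + v * v, u - v]


def partial(func, x, j, h=1e-6):
    """Central-difference approximation of the j-th partial derivative of func at x."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (func(xp) - func(xm)) / (2 * h)


def d_of_pullback(x):
    """Coefficients of d(alpha^* f) = sum_j D_j(f o alpha) dx_j."""
    composite = lambda t: f(alpha(t))
    return [partial(composite, x, j) for j in range(len(x))]


def pullback_of_d(x):
    """Coefficients of alpha^*(df) = sum_i ((D_i f) o alpha) d(alpha_i), expanded in the dx_j."""
    y = alpha(x)
    coeffs = []
    for j in range(len(x)):
        s = 0.0
        for i in range(len(y)):
            Dif = partial(f, y, i)                                # (D_i f)(alpha(x))
            Djai = partial(lambda t, i=i: alpha(t)[i], x, j)      # D_j alpha_i(x)
            s += Dif * Djai
        coeffs.append(s)
    return coeffs


x0 = [0.7, -0.3]
lhs = pullback_of_d(x0)
rhs = d_of_pullback(x0)
```

The two coefficient lists agree up to the finite-difference error, as equations $(*)$ and $(***)$ predict.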

We now have the algebra of differential forms at our disposal, along with
the differential operator $d$. The basic properties of this algebra and the operator $d$,
as summarized in this section and §30, are all we shall need in the sequel.

It is at this point, where one is dealing with the action of a differentiable
map, that one begins to see that forms are in some sense more natural objects
to deal with than are vector fields. A $C^\infty$ map $\alpha : A \to \mathbf{R}^n$, where $A$ is open
in $\mathbf{R}^k$, gives rise to a linear transformation $\alpha_*$ on tangent vectors. But there is
no way to obtain from $\alpha$ a transformation that carries a vector field on $A$ to a
vector field on $\alpha(A)$. Suppose for instance that $F(x) = (x; f(x))$ is a vector
field in $A$. If $y$ is a point of the set $B = \alpha(A)$ such that $y = \alpha(x_1) = \alpha(x_2)$ for
two distinct points $x_1, x_2$ of $A$, then $\alpha_*$ gives rise to two (possibly different)
tangent vectors $\alpha_*(x_1; f(x_1))$ and $\alpha_*(x_2; f(x_2))$ at $y$! See Figure 32.2.



Figure 32.2 
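The difficulty just described can be seen concretely. The following sketch uses a hypothetical example that does not appear in the text: the figure-eight curve $\alpha(t) = (\sin t, \sin 2t)$ on an open interval containing $0$ and $\pi$, with the constant vector field $F(t) = (t; 1)$. The two parameter values $0$ and $\pi$ map to the same point, yet $\alpha_*$ produces two different tangent vectors there.

```python
import math


def alpha(t):
    # Hypothetical sample map: a figure-eight curve in R^2.
    return (math.sin(t), math.sin(2 * t))


def velocity(t):
    # D(alpha)(t), computed by hand for this particular curve.
    return (math.cos(t), 2 * math.cos(2 * t))


def push_forward(t, f_t):
    """alpha_*(t; f_t) = (alpha(t); D(alpha)(t) * f_t)."""
    vx, vy = velocity(t)
    return alpha(t), (vx * f_t, vy * f_t)


f = lambda t: 1.0  # the vector field F(t) = (t; 1) on A

y1, w1 = push_forward(0.0, f(0.0))          # base point (0, 0), vector (1, 2)
y2, w2 = push_forward(math.pi, f(math.pi))  # base point (0, 0) up to rounding, vector (-1, 2)
```

Since the base points coincide but the vectors differ, no single tangent vector at the crossing point can be assigned to $F$ by way of $\alpha_*$.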


This problem does not occur if $\alpha : A \to B$ is a diffeomorphism. In this
case, one can obtain an induced map $\alpha_*$ on vector fields. One assigns to the
vector field $F$ on $A$ the vector field $G = \alpha_* F$ on $B$ defined by the equation

$$G(y) = \alpha_*(F(\alpha^{-1}(y))).$$

A scalar field $h$ on $A$ gives rise to a scalar field $k = \alpha_* h$ on $B$ defined by the
equation $k = h \circ \alpha^{-1}$. The map $\alpha_*$ is not however very natural, for it does not
in general commute with the operators grad, curl, and div of vector calculus,
nor with the “translation” operators $\alpha_i$ and $\beta_j$ of §31. See the exercises.
EXERCISES

1. Prove Theorem 32.1 when $\omega$ and $\eta$ have order zero and when $\theta$ has order
zero.

2. Let $\alpha : \mathbf{R}^3 \to \mathbf{R}^6$ be a $C^\infty$ map. Show directly that

$$d\alpha_1 \wedge d\alpha_3 \wedge d\alpha_5 = (\det D\alpha(1, 3, 5))\, dx_1 \wedge dx_2 \wedge dx_3.$$

3. In $\mathbf{R}^3$, let

$$\omega = xy\, dx + 2z\, dy - y\, dz.$$

Let $\alpha : \mathbf{R}^2 \to \mathbf{R}^3$ be given by the equation

$$\alpha(u, v) = (uv, u^2, 3u + v).$$

Calculate $d\omega$ and $\alpha^*\omega$ and $\alpha^*(d\omega)$ and $d(\alpha^*\omega)$ directly.

4. Show that (a) of Theorem 32.2 is equivalent to the formula $\alpha^*(dy_i) =
d(\alpha^* y_i)$, where $y_i : \mathbf{R}^n \to \mathbf{R}$ is the $i$th projection function in $\mathbf{R}^n$.

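5. Prove the following formula for computing $\alpha^*\omega$ in general:

Theorem. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^\infty$. Let $x$
denote the general point of $\mathbf{R}^k$; let $y$ denote the general point of $\mathbf{R}^n$.
If $I = (i_1, \ldots, i_\ell)$ is an ascending $\ell$-tuple from the set $\{1, \ldots, n\}$,
then

$$\alpha^*(dy_I) = \sum_{[J]} \Bigl(\det \frac{\partial\alpha_I}{\partial x_J}\Bigr)\, dx_J.$$

Here $J = (j_1, \ldots, j_\ell)$ is an ascending $\ell$-tuple from the set $\{1, \ldots, k\}$
and

$$\frac{\partial\alpha_I}{\partial x_J} = \frac{\partial(\alpha_{i_1}, \ldots, \alpha_{i_\ell})}{\partial(x_{j_1}, \ldots, x_{j_\ell})}.$$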

*6. This exercise shows that the transformations $\alpha_i$ and $\beta_j$ of §31 do not
in general behave well with respect to the maps induced by a diffeomorphism
$\alpha$.

Let $\alpha : A \to B$ be a diffeomorphism of open sets in $\mathbf{R}^n$. Let $x$ denote
the general point of $A$, and let $y$ denote the general point of $B$. If
$F(x) = (x; f(x))$ is a vector field in $A$, let $G(y) = \alpha_*(F(\alpha^{-1}(y)))$ be
the corresponding vector field in $B$.

(a) Show that the 1-forms $\alpha_1 G$ and $\alpha_1 F$ do not in general correspond
under the map $\alpha^*$. Specifically, show that $\alpha^*(\alpha_1 G) = \alpha_1 F$ for all $F$
if and only if $D\alpha(x)$ is an orthogonal matrix for each $x$. [Hint: Show
the equation $\alpha^*(\alpha_1 G) = \alpha_1 F$ is equivalent to the equation

$$D\alpha(x)^{\mathrm{tr}} \cdot D\alpha(x) \cdot f(x) = f(x).]$$

(b) Show that $\alpha^*(\beta_{n-1} G) = \beta_{n-1} F$ for all $F$ if and only if $\det D\alpha = +1$.
[Hint: Show the equation $\alpha^*(\beta_{n-1} G) = \beta_{n-1} F$ is equivalent to the
equation $f(x) = (\det D\alpha(x)) \cdot f(x)$.]

(c) If $h$ is a scalar field in $A$, let $k = h \circ \alpha^{-1}$ be the corresponding
scalar field in $B$. Show that $\alpha^*(\beta_n k) = \beta_n h$ for all $h$ if and only if
$\det D\alpha = +1$.

7. Use Exercise 6 to show that if $\alpha$ is an orientation-preserving isometry of
$\mathbf{R}^n$, then the operator $\alpha_*$ on vector fields and scalar fields commutes with
the operators grad and div, and with curl if $n = 3$. (Compare Exercise 5
of §31.)




Stokes’ Theorem 


We saw in the last chapter how $k$-forms provide a generalization to $\mathbf{R}^n$ of the
notions of scalar and vector fields in $\mathbf{R}^3$, and how the differential operator $d$
provides a generalization of the operators grad, curl, and div. Now we define
the integral of a $k$-form over a $k$-manifold; this concept provides a generalization
to $\mathbf{R}^n$ of the notions of line and surface integrals in $\mathbf{R}^3$. Just as line and
surface integrals are involved in the statements of the classical Stokes’ theorem
and divergence theorem in $\mathbf{R}^3$, so are integrals of $k$-forms over $k$-manifolds
involved in the generalized version of these theorems.

We recall here our convention that all manifolds, forms, vector fields, and
scalar fields are assumed to be of class $C^\infty$.


§33. INTEGRATING FORMS OVER PARAMETRIZED-MANIFOLDS 


In Chapter 5, we defined the integral of a scalar function $f$ over a manifold,
with respect to volume. We follow a similar procedure here in defining the
integral of a form of order $k$ over a manifold of dimension $k$. We begin with
parametrized-manifolds.

First let us consider a special case.

Definition. Let $A$ be an open set in $\mathbf{R}^k$; let $\eta$ be a $k$-form defined in $A$.
Then $\eta$ can be written uniquely in the form

$$\eta = f\, dx_1 \wedge \cdots \wedge dx_k.$$

We define the integral of $\eta$ over $A$ by the equation

$$\int_A \eta = \int_A f,$$

provided the latter integral exists.

This definition seems to be coordinate-dependent; in order to define $\int_A \eta$,
we expressed $\eta$ in terms of the standard elementary 1-forms $dx_i$, which depend
on the choice of the standard basis $e_1, \ldots, e_k$ in $\mathbf{R}^k$. One can, however,
formulate the definition in a coordinate-free fashion. Specifically, if $a_1, \ldots, a_k$
is any right-handed orthonormal basis for $\mathbf{R}^k$, then it is an elementary exercise
to show that

$$\int_A \eta = \int_{x \in A} \eta(x)\bigl((x; a_1), \ldots, (x; a_k)\bigr).$$

Thus the integral of $\eta$ does not depend on the choice of basis in $\mathbf{R}^k$, although
it does depend on the orientation of $\mathbf{R}^k$.

We now define the integral of a $k$-form over a parametrized-manifold of
dimension $k$.

Definition. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of class $C^\infty$.
The set $Y = \alpha(A)$, together with the map $\alpha$, constitute the parametrized-manifold
$Y_\alpha$. If $\omega$ is a $k$-form defined in an open set of $\mathbf{R}^n$ containing $Y$, we
define the integral of $\omega$ over $Y_\alpha$ by the equation

$$\int_{Y_\alpha} \omega = \int_A \alpha^*\omega,$$

provided the latter integral exists. Since $\alpha^*$ and $\int_A$ are linear, so is this
integral.

We now show that the integral is invariant under reparametrization, up 
to sign. 

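Theorem 33.1. Let $g : A \to B$ be a diffeomorphism of open sets
in $\mathbf{R}^k$. Assume $\det Dg$ does not change sign on $A$. Let $\beta : B \to \mathbf{R}^n$ be
a map of class $C^\infty$; let $Y = \beta(B)$. Let $\alpha = \beta \circ g$; then $\alpha : A \to \mathbf{R}^n$ and
$Y = \alpha(A)$. If $\omega$ is a $k$-form defined in an open set of $\mathbf{R}^n$ containing $Y$,
then $\omega$ is integrable over $Y_\beta$ if and only if it is integrable over $Y_\alpha$; in
this case,

$$\int_{Y_\alpha} \omega = \epsilon \int_{Y_\beta} \omega,$$

where $\epsilon = \pm 1$ and agrees with the sign of $\det Dg$.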
Theorem 33.2. Let $A$ be open in $\mathbf{R}^k$; let $\alpha : A \to \mathbf{R}^n$ be of
class $C^\infty$; let $Y = \alpha(A)$. Let $x$ denote the general point of $A$; and
let $z$ denote the general point of $\mathbf{R}^n$. If

$$\omega = f\, dz_I$$

is a $k$-form defined in an open set of $\mathbf{R}^n$ containing $Y$, then

$$\int_{Y_\alpha} \omega = \int_A (f \circ \alpha)\, \det(\partial\alpha_I/\partial x).$$

Proof. Applying Theorem 32.2, we have

$$\alpha^*\omega = (f \circ \alpha)\, \det(\partial\alpha_I/\partial x)\, dx_1 \wedge \cdots \wedge dx_k.$$

The theorem follows. □
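Theorem 33.2 reduces the integral of a form to an ordinary integral, which can then be approximated numerically. The sketch below is an illustration under stated assumptions: the parametrization $\alpha(u, v) = (u, v, u^2 + v^2)$ on $A = (0,1)^2$, the two sample forms, the helper names, the midpoint rule, and the central-difference derivatives are all choices made here, not taken from the text.

```python
def minor_det(alpha, I, u, v, h=1e-6):
    """det of rows I = (i1, i2) (1-based) of D(alpha)(u, v), by central differences."""
    i1, i2 = I[0] - 1, I[1] - 1
    du = [(p - m) / (2 * h) for p, m in zip(alpha(u + h, v), alpha(u - h, v))]
    dv = [(p - m) / (2 * h) for p, m in zip(alpha(u, v + h), alpha(u, v - h))]
    return du[i1] * dv[i2] - dv[i1] * du[i2]


def integrate_form(f, I, alpha, n=100):
    """Midpoint-rule value of the integral of f dz_I over Y_alpha, with A = (0,1)^2."""
    step = 1.0 / n
    total = 0.0
    for a in range(n):
        for b in range(n):
            u, v = (a + 0.5) * step, (b + 0.5) * step
            total += f(alpha(u, v)) * minor_det(alpha, I, u, v) * step * step
    return total


def alpha(u, v):
    # Hypothetical sample parametrization of a piece of a paraboloid.
    return (u, v, u * u + v * v)


val_12 = integrate_form(lambda z: z[2], (1, 2), alpha)  # integral of z_3 dz_1 ^ dz_2
val_23 = integrate_form(lambda z: z[0], (2, 3), alpha)  # integral of z_1 dz_2 ^ dz_3
```

By hand, the first integrand is $(u^2 + v^2) \cdot 1$ and the second is $u \cdot (-2u)$, so the exact values are $2/3$ and $-2/3$; the numerical values agree to within the error of the midpoint rule.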

The notion of a $k$-form is a rather abstract one; the notion of its integral
over a parametrized-manifold is even more abstract. In a later section (§36)
we discuss a geometric interpretation of $k$-forms and of their integrals that
gives some insight into their intuitive meaning.


REMARK. We can now make sense of the “dx” notation commonly used in
single-variable calculus. If $\eta = f\, dx$ is a 1-form defined in the open interval
$A = (a, b)$ of the real line $\mathbf{R}$, then

$$\int_A \eta = \int_A f$$

by definition. That is,

$$\int_A f\, dx = \int_a^b f,$$

where the notation on the left denotes the integral of a form, and the notation
on the right denotes the integral of a function! They are equal by definition.
Thus the “dx” notation used in connection with single integrals in calculus
makes perfect sense once one has studied differential forms.

One can also make sense of the notation commonly used in calculus to
denote a line integral. Given a 1-form $P\, dx + Q\, dy + R\, dz$, defined in an
open set $A$ of $\mathbf{R}^3$, and given a parametrized-curve $\gamma : (a, b) \to A$, one has by
the preceding theorem the formula

$$\int_{C_\gamma} P\, dx + Q\, dy + R\, dz = \int_{(a,b)} \Bigl[ P(\gamma(t)) \frac{d\gamma_1}{dt} + Q(\gamma(t)) \frac{d\gamma_2}{dt} + R(\gamma(t)) \frac{d\gamma_3}{dt} \Bigr],$$

where $C$ is the image set of $\gamma$. This is just the formula given in calculus
for evaluating the line integral $\int_C P\, dx + Q\, dy + R\, dz$. Thus the notation
used for line integrals in calculus makes perfect sense once one has studied
differential forms.
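The line-integral formula, and the dependence of the integral on the direction of the parametrization, can both be illustrated numerically. In the sketch below, the 1-form $y\, dx + z\, dy + x\, dz$, the curve $\gamma(t) = (t, t^2, t^3)$ on $(0, 1)$, and the midpoint rule are assumptions chosen for illustration. The curve is also reparametrized by $g(s) = 1 - s$, for which $Dg = -1 < 0$.

```python
def line_integral(P, Q, R, gamma, dgamma, a, b, n=2000):
    """Midpoint-rule value of the integral of P dx + Q dy + R dz over the curve gamma on (a, b)."""
    h = (b - a) / n
    total = 0.0
    for m in range(n):
        t = a + (m + 0.5) * h
        x, y, z = gamma(t)
        dx, dy, dz = dgamma(t)
        total += (P(x, y, z) * dx + Q(x, y, z) * dy + R(x, y, z) * dz) * h
    return total


# Hypothetical sample 1-form y dx + z dy + x dz.
P = lambda x, y, z: y
Q = lambda x, y, z: z
R = lambda x, y, z: x

# Hypothetical sample curve and its derivative.
gamma = lambda t: (t, t ** 2, t ** 3)
dgamma = lambda t: (1.0, 2 * t, 3 * t ** 2)

# Reparametrization by g(s) = 1 - s; the chain rule gives D(gamma o g)(s) = -D(gamma)(1 - s).
gamma_rev = lambda s: gamma(1 - s)
dgamma_rev = lambda s: tuple(-c for c in dgamma(1 - s))

forward = line_integral(P, Q, R, gamma, dgamma, 0.0, 1.0)
backward = line_integral(P, Q, R, gamma_rev, dgamma_rev, 0.0, 1.0)
```

Here the integrand along $\gamma$ is $t^2 + 2t^4 + 3t^3$, whose integral over $(0,1)$ is $89/60$; reversing the direction of the curve changes the sign of the result, in keeping with the statement that the integral is invariant under reparametrization only up to sign.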

It is considerably more difficult, however, to make sense of the “dx dy”
notation commonly used in calculus when dealing with double integrals. If $f$
is a continuous bounded function defined on a subset $A$ of $\mathbf{R}^2$, it is common
in calculus to denote the integral of $f$ over $A$ by the symbol

$$\iint_A f(x, y)\, dx\, dy.$$

Here the symbol “dx dy” has no independent meaning, since the only product
operation we have defined for 1-forms is the wedge product. One justification
for this notation is that it resembles the notation for the iterated integral.
And indeed, if $A$ is the interior of a rectangle $[a, b] \times [c, d]$, then we have the
equation

$$\int_c^d \Bigl[ \int_a^b f(x, y)\, dx \Bigr] dy = \int_A f,$$

by the Fubini theorem. Another justification for this notation is that it resembles
the notation for the integral of a 2-form, and one has the equation

$$\int_A f\, dx \wedge dy = \int_A f$$

by definition. But a difficulty arises when one reverses the roles of $x$ and $y$.
For the iterated integral, one has the equation

$$\int_a^b \Bigl[ \int_c^d f(x, y)\, dy \Bigr] dx = \int_A f,$$

and for the integral of a 2-form, one has the equation

$$\int_A f\, dy \wedge dx = -\int_A f\,!$$

Which rule should one follow in dealing with the symbol

$$\iint_A f(x, y)\, dy\, dx\ ?$$

Whichever choice one makes, confusion is likely to occur. For this reason, the
“dx dy” notation is one we shall not use.

One could, however, use the “dV” notation introduced in Chapter 5
without ambiguity. If $A$ is open in $\mathbf{R}^k$, then $A$ can be considered to be a
parametrized-manifold that is parametrized by the identity map $\alpha : A \to A$.
Then

$$\int_{A_\alpha} f\, dV = \int_A (f \circ \alpha)\, V(D\alpha) = \int_A f,$$

since $D\alpha$ is the identity matrix. Of course, the symbol $d$ used here bears
no relation to the differential operator $d$.
EXERCISES

1. Let $A = (0, 1)^2$. Let $\alpha : A \to \mathbf{R}^3$ be given by the equation

$$\alpha(u, v) = (u, v, u^2 + v^2 + 1).$$

Let $Y$ be the image set of $\alpha$. Evaluate the integral over $Y_\alpha$ of the 2-form
$x_2\, dx_2 \wedge dx_3 + x_1 x_3\, dx_1 \wedge dx_3$.

2. Let $A = (0, 1)^3$. Let $\alpha : A \to \mathbf{R}^4$ be given by the equation

$$\alpha(s, t, u) = (s, u, t, (2u - t)^2).$$

Let $Y$ be the image set of $\alpha$. Evaluate the integral over $Y_\alpha$ of the 3-form
$x_1\, dx_1 \wedge dx_4 \wedge dx_3 + 2 x_2 x_3\, dx_1 \wedge dx_2 \wedge dx_3$.

3. (a) Let $A$ be the open unit ball in $\mathbf{R}^2$. Let $\alpha : A \to \mathbf{R}^3$ be given by the
equation

$$\alpha(u, v) = (u, v, [1 - u^2 - v^2]^{1/2}).$$

Let $Y$ be the image set of $\alpha$. Evaluate the integral over $Y_\alpha$ of the
form $(1/\|x\|^m)(x_1\, dx_2 \wedge dx_3 - x_2\, dx_1 \wedge dx_3 + x_3\, dx_1 \wedge dx_2)$.

(b) Repeat (a) when

$$\alpha(u, v) = (u, v, -[1 - u^2 - v^2]^{1/2}).$$

4. If $\eta$ is a $k$-form in $\mathbf{R}^k$, and if $a_1, \ldots, a_k$ is a basis for $\mathbf{R}^k$, what is the
relation between the integrals

$$\int_A \eta \quad\text{and}\quad \int_{x \in A} \eta(x)\bigl((x; a_1), \ldots, (x; a_k)\bigr)\ ?$$

Show that if the frame $(a_1, \ldots, a_k)$ is orthonormal and right-handed,
they are equal.
§34. ORIENTABLE MANIFOLDS


We shall define the integral of a $k$-form $\omega$ over a $k$-manifold $M$ in much the
same way that we defined the integral of a scalar function over $M$. First,
we treat the case where the support of $\omega$ lies in a single coordinate patch
$\alpha : U \to V$. In this case, we define

$$\int_M \omega = \int_{\operatorname{Int} U} \alpha^*\omega.$$

However, this integral is invariant under reparametrization only up to sign.
Therefore, in order that the integral $\int_M \omega$ be well-defined, we need an extra
condition on $M$. That condition is called orientability. We discuss it in this
section.

Definition. Let $g : A \to B$ be a diffeomorphism of open sets in $\mathbf{R}^k$.
We say that $g$ is orientation-preserving if $\det Dg > 0$ on $A$. We say $g$ is
orientation-reversing if $\det Dg < 0$ on $A$.

This definition generalizes the one given in §20. Indeed, there is associated
with $g$ a linear transformation of tangent spaces,

$$g_* : T_x(\mathbf{R}^k) \to T_{g(x)}(\mathbf{R}^k),$$

given by the equation $g_*(x; v) = (g(x); Dg(x) \cdot v)$. Then $g$ is orientation-preserving
if and only if for each $x$, the linear transformation of $\mathbf{R}^k$ whose
matrix is $Dg(x)$ is orientation-preserving in the sense previously defined.

Definition. Let $M$ be a $k$-manifold in $\mathbf{R}^n$. Given coordinate patches
$\alpha_i : U_i \to V_i$ on $M$ for $i = 0, 1$, we say they overlap if $V_0 \cap V_1$ is non-empty.
We say they overlap positively if the transition function $\alpha_1^{-1} \circ \alpha_0$
is orientation-preserving. If $M$ can be covered by a collection of coordinate
patches each pair of which overlap positively (if they overlap at all), then $M$
is said to be orientable. Otherwise, $M$ is said to be non-orientable.
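The notion of overlapping positively can be tested on a simple case. The following sketch uses two hypothetical coordinate patches on the upper unit semicircle (neither appears in the text): $\alpha_0(t) = (\cos t, \sin t)$ on $(0, \pi)$ and $\alpha_1(s) = (s, \sqrt{1 - s^2})$ on $(-1, 1)$. It samples the sign of the derivative of the transition function $\alpha_1^{-1} \circ \alpha_0$, and then does the same after composing $\alpha_1$ with the reflection $s \mapsto -s$.

```python
import math


def alpha0(t):
    # Hypothetical patch on the upper unit semicircle, t in (0, pi).
    return (math.cos(t), math.sin(t))


def alpha1_inv(p):
    # Inverse of the patch alpha1(s) = (s, sqrt(1 - s^2)), s in (-1, 1).
    return p[0]


def beta1_inv(p):
    # Inverse of the reflected patch beta1(s) = alpha1(-s) = (-s, sqrt(1 - s^2)).
    return -p[0]


def transition_signs(inv_patch, patch, samples, h=1e-6):
    """Set of signs taken by the derivative of inv_patch o patch at the sample points."""
    signs = set()
    for t in samples:
        d = (inv_patch(patch(t + h)) - inv_patch(patch(t - h))) / (2 * h)
        signs.add(1 if d > 0 else -1)
    return signs


samples = [0.1 + 0.1 * m for m in range(30)]  # sample points in (0, pi)

signs_alpha1 = transition_signs(alpha1_inv, alpha0, samples)  # derivative is -sin t
signs_beta1 = transition_signs(beta1_inv, alpha0, samples)    # derivative is +sin t
```

In this example $\alpha_0$ and $\alpha_1$ do not overlap positively, since the transition function $t \mapsto \cos t$ is decreasing, while $\alpha_0$ and the reflected patch do. Sampling finitely many points is of course only a sanity check, not a proof.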

Definition. Let $M$ be a $k$-manifold in $\mathbf{R}^n$. Suppose $M$ is orientable.
Given a collection of coordinate patches covering $M$ that overlap positively,
let us adjoin to this collection all other coordinate patches on $M$ that overlap
these patches positively. It is easy to see that the patches in this expanded
collection overlap one another positively. This expanded collection is called
an orientation on $M$. A manifold $M$ together with an orientation of $M$ is
called an oriented manifold.
This discussion makes no sense for a 0-manifold, which is just a discrete
collection of points. We will discuss later what one might mean by “orientation”
in this case.

If $V$ is a vector space of dimension $k$, then $V$ is also a $k$-manifold. We
thus have two different notions of what is meant by an orientation of $V$. An
orientation of $V$ was defined in §20 to be a collection of $k$-frames in $V$; it is
defined here to be a collection of coordinate patches on $V$. The connection
between these two notions is easy to describe. Given an orientation of $V$ in the
sense of §20, we specify a corresponding orientation of $V$ in the present sense
as follows: For each frame $(v_1, \ldots, v_k)$ belonging to the given orientation
of $V$, the linear isomorphism $\alpha : \mathbf{R}^k \to V$ such that $\alpha(e_i) = v_i$ for each $i$ is
a coordinate patch on $V$. Two such coordinate patches overlap positively, as
you can check; the collection of all such specifies an orientation of $V$ in the
present sense.

Oriented manifolds in $\mathbf{R}^n$ of dimensions $1$ and $n-1$ and $n$

In certain dimensions, the notion of orientation has a geometric interpretation
that is easily described. This situation occurs when $k$ equals $1$ or $n - 1$
or $n$. In the case $k = 1$, we can picture an orientation in terms of a tangent
vector field, as we now show.

Definition. Let $M$ be an oriented 1-manifold in $\mathbf{R}^n$. We define a corresponding
unit tangent vector field $T$ on $M$ as follows: Given $p \in M$, choose a
coordinate patch $\alpha : U \to V$ on $M$ about $p$ belonging to the given orientation.
Define

$$T(p) = (p;\ D\alpha(t_0)/\|D\alpha(t_0)\|),$$

where $t_0$ is the parameter value such that $\alpha(t_0) = p$. Then $T$ is called the
unit tangent field corresponding to the orientation of $M$.

Note that $(p; D\alpha(t_0))$ is the velocity vector of the curve $\alpha$ corresponding
to the parameter value $t = t_0$; then $T(p)$ equals this vector divided by its
length.

We show $T$ is well-defined. Let $\beta$ be a second coordinate patch on $M$
about $p$ belonging to the orientation of $M$. Let $p = \beta(t_1)$ and let $g = \beta^{-1} \circ \alpha$.
Then $g$ is a diffeomorphism of a neighborhood of $t_0$ with a neighborhood of
$t_1$, and

$$D\alpha(t_0) = D(\beta \circ g)(t_0) = D\beta(t_1) \cdot Dg(t_0).$$

Now $Dg(t_0)$ is a 1 by 1 matrix; since $g$ is orientation-preserving, $Dg(t_0) > 0$.
Then

$$D\alpha(t_0)/\|D\alpha(t_0)\| = D\beta(t_1)/\|D\beta(t_1)\|.$$

It follows that the vector field $T$ is of class $C^\infty$, since $t_0 = \alpha^{-1}(p)$ is a
$C^\infty$ function of $p$ and $D\alpha(t)$ is a $C^\infty$ function of $t$.
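The argument that $T$ is well-defined can be checked on a concrete curve. The sketch below is illustrative only: the three patches on the unit circle are hypothetical choices, and derivatives are approximated by central differences. The patch `beta` overlaps `alpha` positively (the transition function is $t \mapsto t/2$), while `gamma` overlaps it negatively (the transition function is $t \mapsto -t$).

```python
import math


def unit_tangent(patch, t, h=1e-6):
    """D(patch)(t) / ||D(patch)(t)||, with the derivative taken by central differences."""
    plus, minus = patch(t + h), patch(t - h)
    d = [(a - b) / (2 * h) for a, b in zip(plus, minus)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]


# Hypothetical patches on the unit circle.
alpha = lambda t: (math.cos(t), math.sin(t))
beta = lambda s: (math.cos(2 * s), math.sin(2 * s))   # alpha = beta o g,  g(t) = t/2, Dg > 0
gamma = lambda s: (math.cos(-s), math.sin(-s))        # alpha = gamma o r, r(t) = -t,  Dr < 0

t0 = 0.8
T_alpha = unit_tangent(alpha, t0)
T_beta = unit_tangent(beta, t0 / 2)   # same point p = alpha(t0) = beta(t0 / 2)
T_gamma = unit_tangent(gamma, -t0)    # same point p = gamma(-t0)
```

The patches `alpha` and `beta` yield the same unit tangent vector at $p$, as the proof asserts; the negatively overlapping patch `gamma` yields its negative.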
EXAMPLE 1. Given an oriented 1-manifold $M$, with corresponding unit tangent
field $T$, we often picture the direction of $T$ by drawing an arrow on the
curve $M$ itself. Thus an oriented 1-manifold gives rise to what is often called
in calculus a directed curve. See Figure 34.1.



Figure 34.1


A difficulty arises if $M$ has non-empty boundary. The problem is indicated
in Figure 34.2, where $\partial M$ consists of the two points $p$ and $q$. If
$\alpha : U \to V$ is a coordinate patch about the boundary point $p$ of $M$, the fact
that $U$ is open in $\mathbf{H}^1$ means that the corresponding unit tangent vector $T(p)$
must point into $M$ from $p$. Similarly, $T(q)$ points into $M$ from $q$. In the
1-manifold indicated, there is no way to define a unit tangent field on $M$ that
points into $M$ at both $p$ and $q$. Thus it would seem that $M$ is not orientable.
Surely this is an anomaly.



The problem disappears if we allow ourselves coordinate patches whose
domains are open sets in $\mathbf{R}^1$ or $\mathbf{H}^1$ or in the left half-line $\mathbf{L}^1 = \{x \mid x \leq 0\}$.
With this extra degree of freedom, it is easy to cover the manifold of the
previous example by coordinate patches that overlap positively. Three such
patches are indicated in Figure 34.3.



In view of the preceding example, we henceforth make the following con- 
vention: 

Convention. In the case of a 1-manifold $M$, we shall allow the
domains of the coordinate patches on $M$ to be open sets in $\mathbf{R}^1$ or in $\mathbf{H}^1$
or in $\mathbf{L}^1$.

It is the case that, with this extra degree of freedom, every 1-manifold is
orientable. We shall not prove this fact.

Now we consider the case where $M$ is an $n-1$ manifold in $\mathbf{R}^n$. In this
case, we can picture an orientation of $M$ in terms of a unit normal vector
field to $M$.

Definition. Let $M$ be an $n-1$ manifold in $\mathbf{R}^n$. If $p \in M$, let $(p; \mathbf{n})$ be
a unit vector in the $n$-dimensional vector space $T_p(\mathbf{R}^n)$ that is orthogonal to
the $n-1$ dimensional linear subspace $T_p(M)$. Then $\mathbf{n}$ is uniquely determined
up to sign. Given an orientation of $M$, choose a coordinate patch $\alpha : U \to V$
on $M$ about $p$ belonging to this orientation; let $\alpha(x) = p$. Then the columns
$\partial\alpha/\partial x_i$ of the matrix $D\alpha(x)$ give a basis

$$(p; \partial\alpha/\partial x_1), \ldots, (p; \partial\alpha/\partial x_{n-1})$$

for the tangent space to $M$ at $p$. We specify the sign of $\mathbf{n}$ by requiring that
the frame

$$(\mathbf{n}, \partial\alpha/\partial x_1, \ldots, \partial\alpha/\partial x_{n-1})$$

be right-handed, that is, that the matrix $[\mathbf{n} \ \ D\alpha(x)]$ have positive determinant.
We shall show in a later section that $\mathbf{n}$ is well-defined, independent of
the choice of $\alpha$, and that the resulting function $\mathbf{n}(p)$ is of class $C^\infty$. The
vector field $N(p) = (p; \mathbf{n}(p))$ is called the unit normal field to $M$ corresponding
to the orientation of $M$.
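For a 2-manifold in $\mathbf{R}^3$ the normal specified by this definition can be computed with the cross product, since $\det[\mathbf{n}\ \ a_1\ \ a_2] = \mathbf{n} \cdot (a_1 \times a_2)$ is positive when $\mathbf{n}$ is a positive multiple of $a_1 \times a_2$. The sketch below applies only to $n = 3$ and rests on assumptions made here: the patch $\alpha(u, v) = (u, v, \sqrt{1 - u^2 - v^2})$ on the unit sphere, the helper names, and central-difference derivatives.

```python
import math


def partials(alpha, u, v, h=1e-6):
    """Columns d(alpha)/du and d(alpha)/dv of D(alpha)(u, v), by central differences."""
    a1 = [(p - m) / (2 * h) for p, m in zip(alpha(u + h, v), alpha(u - h, v))]
    a2 = [(p - m) / (2 * h) for p, m in zip(alpha(u, v + h), alpha(u, v - h))]
    return a1, a2


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def det3(c1, c2, c3):
    """Determinant of the 3 by 3 matrix with columns c1, c2, c3."""
    w = cross(c2, c3)
    return sum(a * b for a, b in zip(c1, w))


def unit_normal(alpha, u, v):
    """Unit vector n with (n, d(alpha)/du, d(alpha)/dv) right-handed."""
    a1, a2 = partials(alpha, u, v)
    w = cross(a1, a2)
    length = math.sqrt(sum(c * c for c in w))
    return [c / length for c in w]


def alpha(u, v):
    # Hypothetical patch on the upper half of the unit sphere.
    return (u, v, math.sqrt(1 - u * u - v * v))


u0, v0 = 0.3, 0.4
n = unit_normal(alpha, u0, v0)
a1, a2 = partials(alpha, u0, v0)
frame_det = det3(n, a1, a2)   # positive, so the frame (n, a1, a2) is right-handed
p = alpha(u0, v0)             # on the unit sphere, n turns out to equal p itself
```

For this patch the resulting normal points away from the origin, so the orientation containing this patch corresponds to the outward normal field on the sphere.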

EXAMPLE 2. We can now give an example of a manifold that is not orientable.
The 2-manifold in $\mathbf{R}^3$ that is pictured in Figure 34.4 has no continuous unit
normal vector field. You can convince yourself of this fact. This manifold is
called the Möbius band.



Figure 34.4


EXAMPLE 3. Another example of a non-orientable 2-manifold is the Klein
bottle. It can be pictured in $\mathbf{R}^3$ as the self-intersecting surface of Figure 34.5.
We think of $K$ as the space swept out by a moving circle, as indicated in the
figure. One can represent $K$ as a 2-manifold without self-intersections in $\mathbf{R}^4$
as follows: Let the circle begin at position $C_0$, and move on to $C_1, C_2$, and so
on. Begin with the circle lying in the subspace $\mathbf{R}^3 \times 0$ of $\mathbf{R}^4$; as it moves from
$C_0$ to $C_1$, and on, let it remain in $\mathbf{R}^3 \times 0$. However, as the circle approaches
the crucial spot where it would have to cross a part of the surface already
generated, let it gradually move “up” into $\mathbf{R}^3 \times \mathbf{H}^1_+$ until it has passed the
crucial spot, and then let it come back down gently into $\mathbf{R}^3 \times 0$ and continue
on its way!



Figure 34.5


Figure 34.6
To see that $K$ is not orientable, we need only note that $K$ contains a
copy of the Möbius band $M$. See Figure 34.6. If $K$ were orientable, then $M$
would be orientable as well. (Take all coordinate patches on $M$ that overlap
positively the coordinate patches belonging to the orientation of $K$.)

Finally, let us consider the case of an $n$-manifold $M$ in $\mathbf{R}^n$. In this case,
not only is $M$ orientable, but it in fact has a “natural” orientation:

Definition. Let $M$ be an $n$-manifold in $\mathbf{R}^n$. If $\alpha : U \to V$ is a coordinate
patch on $M$, then $D\alpha$ is an $n$ by $n$ matrix. We define the natural
orientation of $M$ to consist of all coordinate patches on $M$ for which
$\det D\alpha > 0$. It is easy to see that two such patches overlap positively.

We must show $M$ may be covered by such coordinate patches. Given
$p \in M$, let $\alpha : U \to V$ be a coordinate patch about $p$. Now $U$ is open in
either $\mathbf{R}^n$ or $\mathbf{H}^n$; by shrinking $U$ if necessary, we can assume that $U$ is either
an open $\epsilon$-ball or the intersection with $\mathbf{H}^n$ of an open $\epsilon$-ball. In either case,
$U$ is connected, so $\det D\alpha$ is either positive or negative on all of $U$. If the
former, then $\alpha$ is our desired coordinate patch about $p$; if the latter, then
$\alpha \circ r$ is our desired coordinate patch about $p$, where $r : \mathbf{R}^n \to \mathbf{R}^n$ is the map

$$r(x_1, x_2, \ldots, x_n) = (-x_1, x_2, \ldots, x_n).$$

Reversing the orientation of a manifold 

Let $r : \mathbf{R}^k \to \mathbf{R}^k$ be the reflection map

$$r(x_1, x_2, \ldots, x_k) = (-x_1, x_2, \ldots, x_k);$$

it is its own inverse. The map $r$ carries $\mathbf{H}^k$ to $\mathbf{H}^k$ if $k > 1$, and it carries $\mathbf{H}^1$
to the left half-line $\mathbf{L}^1$ if $k = 1$.

Definition. Let $M$ be an oriented $k$-manifold in $\mathbf{R}^n$. If $\alpha_i : U_i \to V_i$
is a coordinate patch on $M$ belonging to the orientation of $M$, let $\beta_i$ be the
coordinate patch

$$\beta_i = \alpha_i \circ r : r(U_i) \to V_i.$$

Then $\beta_i$ overlaps $\alpha_i$ negatively, so it does not belong to the orientation of $M$.
The coordinate patches $\beta_i$ overlap each other positively, however (as you can
check), so they constitute an orientation of $M$. It is called the reverse, or
opposite, orientation to that specified by the coordinate patches $\alpha_i$.

It follows that every orientable $k$-manifold $M$ has at least two orientations,
a given one and its opposite. If $M$ is connected, it has only two (see
the exercises). Otherwise, it has more than two. The 1-manifold pictured in
Figure 34.7, for example, has four orientations, as indicated.

Figure 34.7

We remark that if $M$ is an oriented 1-manifold with corresponding tangent
field $T$, then reversing the orientation of $M$ results in replacing $T$ by $-T$.
For if $\alpha : U \to V$ is a coordinate patch belonging to the orientation of $M$,
then $\alpha \circ r$ belongs to the opposite orientation. Now $(\alpha \circ r)(t) = \alpha(-t)$, so
that $d(\alpha \circ r)/dt = -d\alpha/dt$.

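Similarly, if $M$ is an oriented $n-1$ manifold in $\mathbf{R}^n$ with corresponding
normal field $N$, reversing the orientation of $M$ results in replacing $N$ by $-N$.
For if $\alpha : U \to V$ belongs to the orientation of $M$, then $\alpha \circ r$ belongs to the
opposite orientation. Now

$$\frac{\partial(\alpha \circ r)}{\partial x_1} = -\frac{\partial\alpha}{\partial x_1} \quad\text{and}\quad \frac{\partial(\alpha \circ r)}{\partial x_i} = \frac{\partial\alpha}{\partial x_i} \quad\text{if } i > 1.$$

Furthermore, one of the frames

$$\Bigl(\mathbf{n}, \frac{\partial\alpha}{\partial x_1}, \frac{\partial\alpha}{\partial x_2}, \ldots, \frac{\partial\alpha}{\partial x_{n-1}}\Bigr) \quad\text{and}\quad \Bigl(-\mathbf{n}, -\frac{\partial\alpha}{\partial x_1}, \frac{\partial\alpha}{\partial x_2}, \ldots, \frac{\partial\alpha}{\partial x_{n-1}}\Bigr)$$

is right-handed if and only if the other one is. Thus if $\mathbf{n}$ corresponds to the
coordinate patch $\alpha$, then $-\mathbf{n}$ corresponds to the coordinate patch $\alpha \circ r$.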


The induced orientation of $\partial M$


Theorem 34.1. Let $k > 1$. If $M$ is an orientable $k$-manifold with
non-empty boundary, then $\partial M$ is orientable.

Proof. Let $p \in \partial M$; let $\alpha : U \to V$ be a coordinate patch about $p$.
There is a corresponding coordinate patch $\alpha_0$ on $\partial M$ that is said to be obtained
by restricting $\alpha$. (See §24.) Formally, if we define $b : \mathbf{R}^{k-1} \to \mathbf{R}^k$ by
the equation

$$b(x_1, \ldots, x_{k-1}) = (x_1, \ldots, x_{k-1}, 0),$$

then $\alpha_0 = \alpha \circ b$.

We show that if $\alpha$ and $\beta$ are coordinate patches about $p$ that overlap
positively, then so do their restrictions $\alpha_0$ and $\beta_0$. Let $g : W_0 \to W_1$ be the
transition function $g = \beta^{-1} \circ \alpha$, where $W_0$ and $W_1$ are open in $\mathbf{H}^k$. Then
$\det Dg > 0$. See Figure 34.8.

Figure 34.8

Now if $x \in \partial\mathbf{H}^k$, then the derivative $Dg$ of $g$ at $x$ has the last row

$$Dg_k = [\,0 \ \cdots \ 0 \ \ \partial g_k/\partial x_k\,],$$

where $\partial g_k/\partial x_k \geq 0$. For if one begins at the point $x$ and gives one of the
variables $x_1, \ldots, x_{k-1}$ an increment, the value of $g_k$ does not change, while
if one gives the variable $x_k$ a positive increment, the value of $g_k$ increases; it
follows that $\partial g_k/\partial x_j$ vanishes at $x$ if $j < k$ and is non-negative if $j = k$.

Since $\det Dg \neq 0$, it follows that $\partial g_k/\partial x_k > 0$ at each point $x$ of $\partial\mathbf{H}^k$.
Then because $\det Dg > 0$, it follows that

$$\det \frac{\partial(g_1, \ldots, g_{k-1})}{\partial(x_1, \ldots, x_{k-1})} > 0.$$

But this matrix is just the derivative of the transition function for the coordinate
patches $\alpha_0$ and $\beta_0$ on $\partial M$. □
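The last inference in the proof can be spelled out by expanding $\det Dg$ along its last row. The display below is a sketch of that step; the entries marked $*$ are the partial derivatives $\partial g_i/\partial x_k$ for $i < k$, which play no role in the expansion.

```latex
\det Dg(x)
  = \det \begin{bmatrix}
      \dfrac{\partial(g_1, \ldots, g_{k-1})}{\partial(x_1, \ldots, x_{k-1})} & * \\[2ex]
      0 \ \cdots \ 0 & \dfrac{\partial g_k}{\partial x_k}
    \end{bmatrix}
  = \frac{\partial g_k}{\partial x_k} \cdot
    \det \frac{\partial(g_1, \ldots, g_{k-1})}{\partial(x_1, \ldots, x_{k-1})},
  \qquad x \in \partial\mathbf{H}^k .
```

Since the left side and the factor $\partial g_k/\partial x_k$ are both positive at each point of $\partial\mathbf{H}^k$, the remaining determinant must be positive as well.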


The proof of the preceding theorem shows that, given an orientation of $M$,
one can obtain an orientation of $\partial M$ by simply taking restrictions of coordinate
patches that belong to the orientation of $M$. However, this orientation
of $\partial M$ is not always the one we prefer. We make the following definition:


Definition. Let $M$ be an orientable $k$-manifold with non-empty boundary.
Given an orientation of $M$, the corresponding induced orientation of
$\partial M$ is defined as follows: If $k$ is even, it is the orientation obtained by simply
restricting coordinate patches belonging to the orientation of $M$. If $k$ is odd,
it is the opposite of the orientation of $\partial M$ obtained in this way.
EXAMPLE 4. The 2-sphere $S^2$ and the torus $T$ are orientable 2-manifolds,
since each is the boundary of a 3-manifold in $\mathbf{R}^3$, which is orientable. In
general, if $M$ is a 3-manifold in $\mathbf{R}^3$, oriented naturally, what can we say about
the induced orientation of $\partial M$? It turns out that it is the orientation of $\partial M$
that corresponds to the unit normal field to $\partial M$ pointing outwards from the
3-manifold $M$. We give an informal argument here to justify this statement,
reserving a formal proof until a later section.

Given $M$, let $\alpha : U \to V$ be a coordinate patch on $M$ belonging to the
natural orientation of $M$, about the point $p$ of $\partial M$. Then the map

$$(\alpha \circ b)(x) = \alpha(x_1, x_2, 0)$$

gives the restricted coordinate patch on $\partial M$ about $p$. Since $\dim M = 3$,
which is odd, the induced orientation of $\partial M$ is opposite to the one obtained
by restricting coordinate patches on $M$. Thus the normal field $N = (p; \mathbf{n})$
to $\partial M$ corresponding to the induced orientation of $\partial M$ satisfies the condition
that the frame $(-\mathbf{n}, \partial\alpha/\partial x_1, \partial\alpha/\partial x_2)$ is right-handed.

On the other hand, since $M$ is oriented naturally, $\det D\alpha > 0$. It follows
that $(\partial\alpha/\partial x_3, \partial\alpha/\partial x_1, \partial\alpha/\partial x_2)$ is right-handed. Thus $-\mathbf{n}$ and $\partial\alpha/\partial x_3$ lie
on the same side of the tangent plane to $\partial M$ at $p$. Since $\partial\alpha/\partial x_3$ points
into $M$, the vector $\mathbf{n}$ points outwards from $M$. See Figure 34.9.



Figure 34.9 


EXAMPLE 5. Let $M$ be a 2-manifold with non-empty boundary, in $\mathbf{R}^3$. If
$M$ is oriented, let us give $\partial M$ the induced orientation. Let $N$ be the unit
normal field to $M$ corresponding to the orientation of $M$; and let $T$ be the
unit tangent field to $\partial M$ corresponding to the induced orientation of $\partial M$.
What is the relationship between $N$ and $T$?

Figure 34.10


We assert the following: Given $N$ and $T$, for each $p \in \partial M$ let $W(p)$ be
the unit vector that is perpendicular to both $N(p)$ and $T(p)$, chosen so that
the frame $(N(p), T(p), W(p))$ is right-handed. Then $W(p)$ is tangent to $M$
at $p$ and points into $M$ from $\partial M$.

(This statement is a more precise way of formulating the description usually
given in the statement of Stokes’ theorem in calculus: “The relation between
$N$ and $T$ is such that if you walk around $\partial M$ in the direction specified
by $T$, with your head pointing in the direction specified by $N$, then the
manifold $M$ is on your left.” See Figure 34.10.)

To verify this assertion, let $\alpha : U \to V$ be a coordinate patch on $M$ about
the point $p$ of $\partial M$, belonging to the orientation of $M$. Then the coordinate
patch $\alpha \circ b$ belongs to the induced orientation of $\partial M$. (Note that $\dim M = 2$,
which is even.) The vector $\partial\alpha/\partial x_1$ represents the velocity vector of the
parametrized curve $\alpha \circ b$; hence by definition it points in the same direction
as the unit tangent vector $T$.

The vector $\partial\alpha/\partial x_2$, on the other hand, is the velocity of a parametrized
curve that begins at a point $p$ of $\partial M$ and moves into $M$ as $t$ increases.
Thus, by definition, it points into $M$ from $p$. Now $\partial\alpha/\partial x_2$ need not be
orthogonal to $\partial M$. But we can choose a scalar $\lambda$ such that the vector
$w = \partial\alpha/\partial x_2 + \lambda\, \partial\alpha/\partial x_1$ is orthogonal to $\partial\alpha/\partial x_1$ and hence to $T$. Then $w$ also
points into $M$; set $W(p) = (p; w/\|w\|)$.

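Finally, the vector $N(p) = (p; n)$ is, by definition, the unit vector normal
to $M$ at $p$ such that the frame $(n, \partial\alpha/\partial x_1, \partial\alpha/\partial x_2)$ is right-handed. Now

$$\det[\, n \;\; \partial\alpha/\partial x_1 \;\; \partial\alpha/\partial x_2 \,] = \det[\, n \;\; \partial\alpha/\partial x_1 \;\; w \,],$$

by direct computation. It follows that the frame $(N, T, W)$ is right-handed.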
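
The determinant identity used at the end of Example 5 can be checked numerically. The following is a minimal sketch, not part of the text's argument: the patch $\alpha(x_1, x_2) = (x_1 + x_2,\, x_2,\, x_1 x_2)$ on the half-plane $x_2 \geq 0$ and all function names are hypothetical choices made for illustration.

```python
# Sketch: check the frame computation of Example 5 for one concrete patch.
# The patch alpha(x1, x2) = (x1 + x2, x2, x1*x2) on the half-plane x2 >= 0
# is a hypothetical choice for illustration; it does not come from the text.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    # Determinant of the 3x3 matrix whose columns are a, b, c.
    return dot(a, cross(b, c))

def unit(a):
    norm = dot(a, a) ** 0.5
    return tuple(x / norm for x in a)

def frames_at_boundary(x1):
    """Return (n, d1, d2, w) at the boundary point alpha(x1, 0)."""
    d1 = (1.0, 0.0, 0.0)      # partial of alpha with respect to x1 at (x1, 0)
    d2 = (1.0, 1.0, x1)       # partial of alpha with respect to x2 at (x1, 0)
    n = unit(cross(d1, d2))   # unit normal making (n, d1, d2) right-handed
    lam = -dot(d2, d1) / dot(d1, d1)
    w = tuple(v + lam * u for u, v in zip(d1, d2))  # w = d2 + lam * d1
    return n, d1, d2, w

def check(x1):
    """Return (|det difference|, det of (N, T, W), <w, d1>)."""
    n, d1, d2, w = frames_at_boundary(x1)
    same_det = abs(det3(n, d1, d2) - det3(n, d1, w))
    ntw = det3(n, unit(d1), unit(w))
    return same_det, ntw, dot(w, d1)
```

For this patch the two determinants agree, $w$ is orthogonal to $\partial\alpha/\partial x_1$, and the orthonormal frame $(N, T, W)$ has determinant $+1$, as the example asserts. The coefficient of $\partial\alpha/\partial x_2$ in $w$ is $1 > 0$, which is why $w$ still points into $M$.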
EXERCISES

1. Let $M$ be an $n$-manifold in $\mathbf{R}^n$. Let $\alpha, \beta$ be coordinate patches on $M$
such that $\det D\alpha > 0$ and $\det D\beta > 0$. Show that $\alpha$ and $\beta$ overlap
positively if they overlap at all.

2. Let $M$ be a $k$-manifold in $\mathbf{R}^n$; let $\alpha, \beta$ be coordinate patches on $M$. Show
that if $\alpha$ and $\beta$ overlap positively, so do $\alpha \circ r$ and $\beta \circ r$.

3. Let $M$ be an oriented 1-manifold in $\mathbf{R}^2$, with corresponding unit tangent
vector field $T$. Describe the unit normal field corresponding to the
orientation of $M$.

4. Let $C$ be the cylinder in $\mathbf{R}^3$ given by

$$C = \{(x, y, z) \mid x^2 + y^2 = 1;\; 0 \leq z \leq 1\}.$$

Orient $C$ by declaring the coordinate patch $\alpha : (0,1)^2 \to C$ given by

$$\alpha(u, v) = (\cos 2\pi u,\, \sin 2\pi u,\, v)$$

to belong to the orientation. See Figure 34.11. Describe the unit normal
field corresponding to this orientation of $C$. Describe the unit tangent
field corresponding to the induced orientation of $\partial C$.

Figure 34.11


5. Let $M$ be the 2-manifold in $\mathbf{R}^2$ pictured in Figure 34.12, oriented naturally.
The induced orientation of $\partial M$ corresponds to a unit tangent
vector field; describe it. The induced orientation of $\partial M$ also corresponds
to a unit normal field; describe it.

Figure 34.12

6. Show that if $M$ is a connected orientable $k$-manifold in $\mathbf{R}^n$, then $M$
has precisely two orientations, as follows: Choose an orientation of $M$;
it consists of a collection of coordinate patches $\{\alpha_i\}$. Let $\{\beta_j\}$ be an
arbitrary orientation of $M$. Given $x \in M$, choose coordinate patches $\alpha_i$
and $\beta_j$ about $x$ and define $\lambda(x) = 1$ if they overlap positively at $x$, and
$\lambda(x) = -1$ if they overlap negatively at $x$.

(a) Show that $\lambda(x)$ is well-defined, independent of the choice of $\alpha_i$ and $\beta_j$.

(b) Show that $\lambda$ is continuous.

(c) Show that $\lambda$ is constant.

(d) Show that $\{\beta_j\}$ gives the opposite orientation to $\{\alpha_i\}$ if $\lambda$ is
identically $-1$, and the same orientation if $\lambda$ is identically $1$.

7. Let $M$ be the 3-manifold in $\mathbf{R}^3$ consisting of all $x$ with $1 \leq \|x\| \leq 2$.
Orient $M$ naturally. Describe the unit normal field corresponding to the
induced orientation of $\partial M$.

8. Let $B^n = B^n(1)$ be the unit ball in $\mathbf{R}^n$, oriented naturally. Let the unit
sphere $S^{n-1} = \partial B^n$ have the induced orientation. Does the coordinate
patch $\alpha : \mathrm{Int}\, B^{n-1} \to S^{n-1}$ given by the equation

$$\alpha(u) = (u,\, [1 - \|u\|^2]^{1/2})$$

belong to the orientation of $S^{n-1}$? What about the coordinate patch

$$\beta(u) = (u,\, -[1 - \|u\|^2]^{1/2})?$$

§35. INTEGRATING FORMS OVER ORIENTED MANIFOLDS


Now we define the integral of a $k$-form $\omega$ over an oriented $k$-manifold. The
procedure is very similar to that of §25, where we defined the integral of a
scalar function over a manifold. Therefore we abbreviate some of the details.

We treat first the case where the support of $\omega$ can be covered by a single
coordinate patch.

Definition. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$. Let $\omega$ be a
$k$-form defined in an open set of $\mathbf{R}^n$ containing $M$. Let $C = M \cap (\text{Support } \omega)$;
then $C$ is compact. Suppose there is a coordinate patch $\alpha : U \to V$ on $M$
belonging to the orientation of $M$ such that $C \subset V$. By replacing $U$ by a
smaller open set if necessary, we can assume that $U$ is bounded. We define
the integral of $\omega$ over $M$ by the equation

$$\int_M \omega = \int_{\mathrm{Int}\, U} \alpha^*\omega.$$

Here $\mathrm{Int}\, U = U$ if $U$ is open in $\mathbf{R}^k$, and $\mathrm{Int}\, U = U \cap \mathbf{H}^k_+$ if $U$ is open in
$\mathbf{H}^k$ but not in $\mathbf{R}^k$.


First, we note that this integral exists as an ordinary integral, and hence
as an extended integral: Since $\alpha$ can be extended to a $C^\infty$ map defined on
a set $U'$ open in $\mathbf{R}^k$, the form $\alpha^*\omega$ can be extended to a $C^\infty$ form on $U'$.
This form can be written as $h\, dx_1 \wedge \cdots \wedge dx_k$ for some $C^\infty$ scalar function $h$
on $U'$. Thus

$$\int_{\mathrm{Int}\, U} \alpha^*\omega = \int_{\mathrm{Int}\, U} h,$$

by definition. The function $h$ is continuous on $U$ and vanishes on $U$ outside
the compact set $\alpha^{-1}(C)$; hence $h$ is bounded on $U$. If $U$ is open in $\mathbf{R}^k$, then
$h$ vanishes near each point of $\mathrm{Bd}\, U$. If $U$ is not open in $\mathbf{R}^k$, then $h$ vanishes
near each point of $\mathrm{Bd}\, U$ not in $\partial \mathbf{H}^k$, a set that has measure zero in $\mathbf{R}^k$. In
either case, $h$ is integrable over $U$ and hence over $\mathrm{Int}\, U$. See Figure 35.1.

Second, we note that the integral $\int_M \omega$ is well-defined, independent of
the choice of the coordinate patch $\alpha$. The proof is very similar to that of
Lemma 25.1; here one uses the additional fact that the transition function is
orientation-preserving, so that the sign in the formula given in Theorem 33.1
is “plus.”
Third, we note that this integral is linear. More precisely, if $\omega$ and $\eta$ have
supports whose intersections with $M$ can be covered by the single coordinate
patch $\alpha : U \to V$ belonging to the orientation of $M$, then

$$\int_M (a\omega + b\eta) = a \int_M \omega + b \int_M \eta.$$

This result follows at once from the fact that $\alpha^*$ and $\int_{\mathrm{Int}\, U}$ are linear.

Finally, we note that if $-M$ denotes the manifold $M$ with the opposite
orientation, then

$$\int_{-M} \omega = -\int_M \omega.$$

This result follows from Theorem 33.1.

To define $\int_M \omega$ in general, we use a partition of unity.

Definition. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$. Let $\omega$ be
a $k$-form defined in an open set of $\mathbf{R}^n$ containing $M$. Cover $M$ by coordinate
patches belonging to the orientation of $M$; then choose a partition of unity
$\phi_1, \ldots, \phi_\ell$ on $M$ that is dominated by this collection of coordinate patches
on $M$. See Lemma 25.2. We define the integral of $\omega$ over $M$ by the equation

$$\int_M \omega = \sum_{i=1}^{\ell} \Big[ \int_M \phi_i\, \omega \Big].$$

The fact that this definition agrees with the previous one when the support
of $\omega$ is covered by a single coordinate patch follows from linearity of the earlier
integral and the fact that

$$\omega(x) = \sum_{i=1}^{\ell} \phi_i(x)\, \omega(x)$$

for each $x \in M$. The fact that the integral is independent of the choice of the
partition of unity follows by the same argument that was used for the integral
$\int_M f\, dV$. The following is also immediate:


Theorem 35.1. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$.
Let $\omega, \eta$ be $k$-forms defined in an open set of $\mathbf{R}^n$ containing $M$. Then

$$\int_M (a\omega + b\eta) = a \int_M \omega + b \int_M \eta.$$

If $-M$ denotes $M$ with the opposite orientation, then

$$\int_{-M} \omega = -\int_M \omega. \qquad \square$$

This definition of the integral is satisfactory for theoretical purposes, but
not for computational purposes. As in the case of the integral $\int_M f\, dV$, the
practical way of evaluating $\int_M \omega$ is to break $M$ up into pieces, integrate over
each piece separately, and add the results together. We state this fact formally
as a theorem:

*Theorem 35.2. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$.
Let $\omega$ be a $k$-form defined in an open set of $\mathbf{R}^n$ containing $M$. Suppose
that $\alpha_i : A_i \to M_i$, for $i = 1, \ldots, N$, is a coordinate patch on $M$ belonging
to the orientation of $M$, such that $A_i$ is open in $\mathbf{R}^k$ and $M$ is
the disjoint union of the open sets $M_1, \ldots, M_N$ of $M$ and a set $K$ of
measure zero in $M$. Then

$$\int_M \omega = \sum_{i=1}^{N} \Big[ \int_{A_i} \alpha_i^*\omega \Big].$$

Proof. The proof is almost a copy of the proof of Theorem 25.4. Alternatively,
it follows from Theorems 25.4 and 36.2. We leave the details
to you. □
EXERCISES

1. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$. Let $\omega$ be a $k$-form defined
in an open set of $\mathbf{R}^n$ containing $M$.

(a) Show that in the case where the set $C = M \cap (\text{Support } \omega)$ is covered
by a single coordinate patch, the integral $\int_M \omega$ is well-defined.

(b) Show that the integral $\int_M \omega$ is well-defined in general, independent of
the choice of the partition of unity.

2. Prove Theorem 35.2.

3. Let $S^{n-1}$ be the unit sphere in $\mathbf{R}^n$, oriented so that the coordinate patch
$\alpha : A \to S^{n-1}$ given by

$$\alpha(u) = (u,\, [1 - \|u\|^2]^{1/2})$$

belongs to the orientation, where $A = \mathrm{Int}\, B^{n-1}$. Let $\eta$ be the $n-1$ form

$$\eta = \sum_{i=1}^{n} (-1)^{i-1} f_i\, dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n,$$

where $f_i(x) = x_i/\|x\|^m$. The form $\eta$ is defined on $\mathbf{R}^n - 0$. Show that

$$\int_{S^{n-1}} \eta \neq 0,$$

as follows:

(a) Let $p : \mathbf{R}^n \to \mathbf{R}^n$ be given by

$$p(x_1, \ldots, x_{n-1}, x_n) = (x_1, \ldots, x_{n-1}, -x_n).$$

Let $\beta = p \circ \alpha$. Show that $\beta : A \to S^{n-1}$ belongs to the opposite
orientation of $S^{n-1}$. [Hint: The map $p : B^n \to B^n$ is orientation-reversing.]

(b) Show that $\beta^*\eta = -\alpha^*\eta$; conclude that

$$\int_{S^{n-1}} \eta = 2 \int_A \alpha^*\eta.$$

(c) Show that

$$\int_A \alpha^*\eta = \pm \int_A 1/[1 - \|u\|^2]^{1/2} \neq 0.$$

*§36. A GEOMETRIC INTERPRETATION OF FORMS AND INTEGRALS


The notion of the integral of a $k$-form over an oriented $k$-manifold seems
remarkably abstract. Can one give it any intuitive meaning? We discuss here
how it is related to the integral of a scalar function over a manifold, which is
a notion closer to our geometric intuition.

First, we explore the relationship between alternating tensors in $\mathbf{R}^n$ and
the volume function in $\mathbf{R}^n$.

Theorem 36.1. Let $W$ be a $k$-dimensional linear subspace of $\mathbf{R}^n$;
let $(a_1, \ldots, a_k)$ be an orthonormal $k$-frame in $W$, and let $f$ be an
alternating $k$-tensor on $W$. If $(x_1, \ldots, x_k)$ is an arbitrary $k$-tuple in $W$,
then

$$f(x_1, \ldots, x_k) = \epsilon\, V(x_1, \ldots, x_k)\, f(a_1, \ldots, a_k),$$

where $\epsilon = \pm 1$. If the $x_i$ are independent, then $\epsilon = +1$ if the frames
$(x_1, \ldots, x_k)$ and $(a_1, \ldots, a_k)$ belong to the same orientation of $W$ and
$\epsilon = -1$ otherwise.

If the $x_i$ are dependent, then $V(x_1, \ldots, x_k) = 0$ by Theorem 21.3 and
the value of $\epsilon$ does not matter.

Proof. Step 1. If $W = \mathbf{R}^k$, then the theorem holds. In that case, the
$k$-tensor $f$ is a multiple of the determinant function, so there is a scalar $c$
such that for all $k$-tuples $(x_1, \ldots, x_k)$ in $\mathbf{R}^k$,

$$f(x_1, \ldots, x_k) = c \det[x_1 \cdots x_k].$$

If the $x_i$ are dependent, it follows that $f$ vanishes; then the theorem holds
trivially. Otherwise, we have

$$f(x_1, \ldots, x_k) = c \det[x_1 \cdots x_k] = c\, \epsilon_1 V(x_1, \ldots, x_k),$$

where $\epsilon_1 = +1$ if $(x_1, \ldots, x_k)$ is right-handed, and $\epsilon_1 = -1$ otherwise.
Similarly,

$$f(a_1, \ldots, a_k) = c\, \epsilon_2 V(a_1, \ldots, a_k) = c\, \epsilon_2,$$

where $\epsilon_2 = +1$ if $(a_1, \ldots, a_k)$ is right-handed and $\epsilon_2 = -1$ otherwise. It
follows that

$$f(x_1, \ldots, x_k) = (\epsilon_1/\epsilon_2)\, V(x_1, \ldots, x_k)\, f(a_1, \ldots, a_k),$$

where $\epsilon_1/\epsilon_2 = +1$ if $(x_1, \ldots, x_k)$ and $(a_1, \ldots, a_k)$ belong to the same
orientation of $\mathbf{R}^k$, and $\epsilon_1/\epsilon_2 = -1$ otherwise.
Step 2. The theorem holds in general. Given $W$, choose an orthogonal
transformation $h : \mathbf{R}^n \to \mathbf{R}^n$ carrying $W$ onto $\mathbf{R}^k \times 0$. Let $k : \mathbf{R}^k \times 0 \to W$
be the inverse map. Since $f$ is an alternating tensor on $W$, it is mapped to
an alternating tensor $k^*f$ on $\mathbf{R}^k \times 0$. Since $(h(x_1), \ldots, h(x_k))$ is a $k$-tuple in
$\mathbf{R}^k \times 0$, and $(h(a_1), \ldots, h(a_k))$ is an orthonormal $k$-tuple in $\mathbf{R}^k \times 0$, we have
by Step 1,

$$(k^*f)(h(x_1), \ldots, h(x_k)) = \epsilon\, V(h(x_1), \ldots, h(x_k))\, (k^*f)(h(a_1), \ldots, h(a_k)),$$

where $\epsilon = \pm 1$. Since $V$ is unchanged by orthogonal transformations, we can
rewrite this equation as

$$f(x_1, \ldots, x_k) = \epsilon\, V(x_1, \ldots, x_k)\, f(a_1, \ldots, a_k),$$

as desired. Now suppose the $x_i$ are independent. Then the $h(x_i)$ are
independent, and by Step 1 we have $\epsilon = +1$ if and only if $(h(x_1), \ldots, h(x_k))$ and
$(h(a_1), \ldots, h(a_k))$ belong to the same orientation of $\mathbf{R}^k \times 0$. By definition,
this occurs if and only if $(x_1, \ldots, x_k)$ and $(a_1, \ldots, a_k)$ belong to the same
orientation of $W$. □

Note that it follows from this theorem that if $(a_1, \ldots, a_k)$ and
$(b_1, \ldots, b_k)$ are two orthonormal frames in $W$, then

$$f(a_1, \ldots, a_k) = \pm f(b_1, \ldots, b_k),$$

the sign depending on whether they belong to the same orientation of $W$
or not.
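
As an illustration of Step 1 of Theorem 36.1, the sketch below checks the identity $f(x_1, x_2, x_3) = \epsilon\, V(x_1, x_2, x_3)\, f(a_1, a_2, a_3)$ in the case $W = \mathbf{R}^3$. The scalar `C`, the sample vectors, and the function names are assumptions chosen for the example; $V$ is computed as $[\det(X^{\mathrm{tr}} X)]^{1/2}$.

```python
# Sketch: verify f(x) = eps * V(x) * f(a) for an alternating 3-tensor on R^3.
# C and the sample frames are hypothetical values chosen for illustration.

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def det3(vecs):
    # Determinant of the 3x3 matrix formed by the three vectors.
    a, b, c = vecs
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def volume(vecs):
    # V(X) = sqrt(det(X^tr X)), computed from the Gram matrix.
    gram = [[dot(u, v) for v in vecs] for u in vecs]
    return det3(gram) ** 0.5

def sign(t):
    return 1.0 if t > 0 else -1.0

C = -2.5

def f(vecs):
    # Every alternating 3-tensor on R^3 is a multiple of the determinant.
    return C * det3(vecs)

def epsilon(xs, frame):
    # eps = eps_1 / eps_2: +1 exactly when xs and frame share an orientation.
    return sign(det3(xs)) * sign(det3(frame))
```

With `xs = [(1, 2, 0), (0, 1, 3), (2, 0, 1)]` and the left-handed orthonormal frame `[(0, 1, 0), (1, 0, 0), (0, 0, 1)]`, the two sides of the identity agree with $\epsilon = -1$.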

Definition. Let $M$ be a $k$-manifold in $\mathbf{R}^n$; let $p \in M$. If $M$ is oriented,
then the tangent space to $M$ at $p$ has a natural induced orientation, defined as
follows: Choose a coordinate patch $\alpha : U \to V$ belonging to the orientation
of $M$ about $p$. Let $\alpha(x) = p$. The collection of all $k$-frames in $T_p(M)$ of
the form

$$(\alpha_*(x; a_1), \ldots, \alpha_*(x; a_k))$$

where $(a_1, \ldots, a_k)$ is a right-handed frame in $\mathbf{R}^k$, is called the natural
orientation of $T_p(M)$, induced by the orientation of $M$. It is easy to show it is
well-defined, independent of the choice of $\alpha$.

Theorem 36.2. Let $M$ be a compact oriented $k$-manifold in $\mathbf{R}^n$;
let $\omega$ be a $k$-form defined in an open set of $\mathbf{R}^n$ containing $M$. Let $\lambda$ be
the scalar function on $M$ defined by the equation

$$\lambda(p) = \omega(p)((p; a_1), \ldots, (p; a_k)),$$

where $((p; a_1), \ldots, (p; a_k))$ is any orthonormal frame in the linear space
$T_p(M)$ belonging to its natural orientation. Then $\lambda$ is continuous, and

$$\int_M \omega = \int_M \lambda\, dV.$$


Proof. By linearity, it suffices to consider the case where the support
of $\omega$ is covered by a single coordinate patch $\alpha : U \to V$ belonging to the
orientation of $M$. We have

$$\alpha^*\omega = h\, dx_1 \wedge \cdots \wedge dx_k$$

for some scalar function $h$. Let $\alpha(x) = p$. We compute $h(x)$ as follows:

$$\begin{aligned}
h(x) &= (\alpha^*\omega)(x)((x; e_1), \ldots, (x; e_k)) \\
&= \omega(\alpha(x))(\alpha_*(x; e_1), \ldots, \alpha_*(x; e_k)) \\
&= \omega(p)((p; \partial\alpha/\partial x_1), \ldots, (p; \partial\alpha/\partial x_k)) \\
&= \pm V(D\alpha(x))\, \lambda(p),
\end{aligned}$$

by Theorem 36.1. The sign is “plus” because the frame

$$((p; \partial\alpha/\partial x_1), \ldots, (p; \partial\alpha/\partial x_k))$$

belongs to the natural orientation of $T_p(M)$ by definition. Now $V(D\alpha) \neq 0$
because $D\alpha$ has rank $k$. Then since $x = \alpha^{-1}(p)$ is a continuous function
of $p$, so is

$$\lambda(p) = h(x)/V(D\alpha(x)).$$

It follows that

$$\int_M \lambda\, dV = \int_{\mathrm{Int}\, U} (\lambda \circ \alpha)\, V(D\alpha) = \int_{\mathrm{Int}\, U} h.$$

On the other hand,

$$\int_M \omega = \int_{\mathrm{Int}\, U} \alpha^*\omega = \int_{\mathrm{Int}\, U} h,$$

by definition. The theorem follows. □


This theorem tells us that, given a $k$-form $\omega$ defined in an open set about
the compact oriented $k$-manifold $M$ in $\mathbf{R}^n$, there exists a scalar function $\lambda$
(which is in fact of class $C^\infty$) such that

$$\int_M \omega = \int_M \lambda\, dV.$$

The reverse is also true, but the proof is a good deal harder:

One first shows that there exists a $k$-form $\omega_v$, defined in an open set
about $M$, such that the value of $\omega_v(p)$ on any orthonormal basis for $T_p(M)$
belonging to its natural orientation is 1. Then if $\lambda$ is any $C^\infty$ function on $M$,
we have

$$\int_M \lambda\, dV = \int_M \lambda\, \omega_v;$$

thus the integral of $\lambda$ over $M$ can be interpreted as the integral over $M$ of
the form $\lambda\, \omega_v$. The form $\omega_v$ is called a volume form for $M$, since

$$\int_M \omega_v = \int_M dV = v(M).$$

This argument applies, however, only if M is orientable. If M is not 
orientable, the integral of a scalar function is defined, but the integral of a 
form is not. 

A remark on notation. Some mathematicians denote the volume form $\omega_v$
by the symbol dV, or rather by the symbol dV. (See the remark on notation
in §22.) While it makes the preceding equations tautologies, this practice can
cause confusion to the unwary, since $V$ is not a form, and $d$ does not denote
the differential operator in this context!



EXERCISE 

1. Let $M$ be a $k$-manifold in $\mathbf{R}^n$; let $p \in M$. Let $\alpha$ and $\beta$ be coordinate
patches on $M$ about $p$; let $\alpha(x) = p = \beta(y)$. Let $(a_1, \ldots, a_k)$ be a
right-handed frame in $\mathbf{R}^k$. If $\alpha$ and $\beta$ overlap positively, show that there
is a right-handed frame $(b_1, \ldots, b_k)$ in $\mathbf{R}^k$ such that

$$\alpha_*(x; a_i) = \beta_*(y; b_i)$$

for each $i$. Conclude that if $M$ is oriented, then the natural orientation
of $T_p(M)$ is well-defined.

§37. THE GENERALIZED STOKES’ THEOREM


Now, finally, we come to the theorem that is the culmination of all our labors. 
It is a general theorem about integrals of differential forms that includes the 
three basic theorems of vector integral calculus — Green’s theorem, Stokes’ 
theorem, and the divergence theorem — as special cases. 

We begin with a lemma that is in some sense a very special case of the
theorem. Let $I^k$ denote the unit $k$-cube in $\mathbf{R}^k$;

$$I^k = [0,1]^k = [0,1] \times \cdots \times [0,1].$$

Then $\mathrm{Int}\, I^k$ is the open cube $(0,1)^k$, and $\mathrm{Bd}\, I^k$ equals $I^k - \mathrm{Int}\, I^k$.


Lemma 37.1. Let $k > 1$. Let $\eta$ be a $k-1$ form defined in an open
set $U$ of $\mathbf{R}^k$ containing the unit $k$-cube $I^k$. Assume that $\eta$ vanishes at
all points of $\mathrm{Bd}\, I^k$ except possibly at points of the subset $(\mathrm{Int}\, I^{k-1}) \times 0$.
Then

$$\int_{\mathrm{Int}\, I^k} d\eta = (-1)^k \int_{\mathrm{Int}\, I^{k-1}} b^*\eta,$$

where $b : I^{k-1} \to I^k$ is the map

$$b(u_1, \ldots, u_{k-1}) = (u_1, \ldots, u_{k-1}, 0).$$


Proof. We use $x$ to denote the general point of $\mathbf{R}^k$, and $u$ to denote the
general point of $\mathbf{R}^{k-1}$. See Figure 37.1. Given $j$ with $1 \leq j \leq k$, let $I_j$ denote
the $k-1$ tuple

$$I_j = (1, \ldots, \hat{\jmath}, \ldots, k).$$

Then the typical elementary $k-1$ form in $\mathbf{R}^k$ is the form

$$dx_{I_j} = dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_k.$$

Because the integrals involved are linear and the operators $d$ and $b^*$ are
linear, it suffices to prove the lemma in the special case

$$\eta = f\, dx_{I_j},$$

so we assume this value of $\eta$ in the remainder of the proof.

Figure 37.1

Step 1. We compute the integral $\int_{\mathrm{Int}\, I^k} d\eta$.


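We have

$$\begin{aligned}
d\eta &= df \wedge dx_{I_j} \\
&= \Big( \sum_{i=1}^{k} D_i f\, dx_i \Big) \wedge dx_{I_j} \\
&= (-1)^{j-1} (D_j f)\, dx_1 \wedge \cdots \wedge dx_k.
\end{aligned}$$

Then we compute

$$\begin{aligned}
\int_{\mathrm{Int}\, I^k} d\eta &= (-1)^{j-1} \int_{\mathrm{Int}\, I^k} D_j f \\
&= (-1)^{j-1} \int_{I^k} D_j f \\
&= (-1)^{j-1} \int_{v \in I^{k-1}} \int_{x_j \in I} D_j f(x_1, \ldots, x_k),
\end{aligned}$$

where $v = (x_1, \ldots, \widehat{x_j}, \ldots, x_k)$. Here we use the Fubini theorem. By the
fundamental theorem of calculus, we compute the inner integral as

$$\int_{x_j \in I} D_j f(x_1, \ldots, x_k) = f(x_1, \ldots, 1, \ldots, x_k) - f(x_1, \ldots, 0, \ldots, x_k),$$

where the 1 and the 0 appear in the $j$th place. Now the form $\eta$, and hence
the function $f$, vanish at all points of $\mathrm{Bd}\, I^k$ except possibly at points of the
open bottom face $(\mathrm{Int}\, I^{k-1}) \times 0$. If $j < k$, this means that the right side of
this equation vanishes; while if $j = k$, it equals

$$-f(x_1, \ldots, x_{k-1}, 0).$$

We conclude the following:

$$\int_{\mathrm{Int}\, I^k} d\eta = \begin{cases} 0 & \text{if } j < k, \\ (-1)^k \int_{\mathrm{Int}\, I^{k-1}} (f \circ b) & \text{if } j = k. \end{cases}$$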



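Step 2. Now we compute the other integral of our theorem. The map
$b : \mathbf{R}^{k-1} \to \mathbf{R}^k$ has derivative

$$Db = \begin{bmatrix} I_{k-1} \\ 0 \end{bmatrix}.$$

Therefore, by Theorem 32.2, we have

$$b^*(dx_{I_j}) = [\det Db(1, \ldots, \hat{\jmath}, \ldots, k)]\, du_1 \wedge \cdots \wedge du_{k-1}
= \begin{cases} 0 & \text{if } j < k, \\ du_1 \wedge \cdots \wedge du_{k-1} & \text{if } j = k. \end{cases}$$

We conclude that

$$\int_{\mathrm{Int}\, I^{k-1}} b^*\eta = \begin{cases} 0 & \text{if } j < k, \\ \int_{\mathrm{Int}\, I^{k-1}} (f \circ b) & \text{if } j = k. \end{cases}$$

The theorem follows by comparing this equation with that at the end of
Step 1. □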
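
To make the sign $(-1)^k$ in Lemma 37.1 concrete, here is a small numerical sketch for $k = 2$ and $j = 2$. The function $f(x, y) = x(1-x)(1-y)$ is a hypothetical choice that vanishes on $\mathrm{Bd}\, I^2$ except on the open bottom face; the form is $\eta = f\, dx_1$, and the integrals are approximated by the midpoint rule. Both sides should be close to $1/6$.

```python
# Sketch: numerical check of Lemma 37.1 for k = 2 with eta = f dx_1.
# f is a hypothetical example vanishing on Bd I^2 except on the bottom face.

def f(x, y):
    return x * (1.0 - x) * (1.0 - y)

def d2f(x, y, h=1e-5):
    # Central-difference approximation of D_2 f.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def integral_d_eta(n=200):
    # d(eta) = (D_2 f) dx_2 ^ dx_1 = -(D_2 f) dx_1 ^ dx_2, so we integrate
    # -(D_2 f) over the open unit square with the midpoint rule.
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = (j + 0.5) * step
            total += -d2f(x, y) * step * step
    return total

def integral_pullback(n=200):
    # b(u) = (u, 0), so b^* eta = f(u, 0) du; here (-1)^k = +1 since k = 2.
    step = 1.0 / n
    return sum(f((i + 0.5) * step, 0.0) * step for i in range(n))
```

Note that $(-1)^{j-1} = -1$ from differentiating and the minus sign from evaluating at the bottom face combine to give $(-1)^k = +1$, matching the lemma.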


Theorem 37.2 (Stokes’ theorem). Let $k > 1$. Let $M$ be a compact
oriented $k$-manifold in $\mathbf{R}^n$; give $\partial M$ the induced orientation if
$\partial M$ is not empty. Let $\omega$ be a $k-1$ form defined in an open set of
$\mathbf{R}^n$ containing $M$. Then

$$\int_M d\omega = \int_{\partial M} \omega$$

if $\partial M$ is not empty; and $\int_M d\omega = 0$ if $\partial M$ is empty.

Proof. Step 1. We first cover $M$ by carefully-chosen coordinate patches.
As a first case, assume that $p \in M - \partial M$. Choose a coordinate patch
$\alpha : U \to V$ belonging to the orientation of $M$, such that $U$ is open in $\mathbf{R}^k$
and contains the unit cube $I^k$, and such that $\alpha$ carries a point of $\mathrm{Int}\, I^k$ to
the point $p$. (If we begin with an arbitrary coordinate patch $\alpha : U \to V$
about $p$ belonging to the orientation of $M$, we can obtain one of the desired
type by preceding $\alpha$ by a translation and a stretching in $\mathbf{R}^k$.) Let $W = \mathrm{Int}\, I^k$,
and let $Y = \alpha(W)$. Then the map $\alpha : W \to Y$ is still a coordinate patch
belonging to the orientation of $M$ about $p$, with $W = \mathrm{Int}\, I^k$ open in $\mathbf{R}^k$. See
Figure 37.2. We choose this special patch about $p$.

Figure 37.2

As a second case, assume that $p \in \partial M$. Choose a coordinate patch
$\alpha : U \to V$ belonging to the orientation of $M$, such that $U$ is open in $\mathbf{H}^k$
and $U$ contains $I^k$, and such that $\alpha$ carries a point of $(\mathrm{Int}\, I^{k-1}) \times 0$ to the
point $p$. Let

$$W = (\mathrm{Int}\, I^k) \cup ((\mathrm{Int}\, I^{k-1}) \times 0),$$

and let $Y = \alpha(W)$. Then the map $\alpha : W \to Y$ is still a coordinate patch
belonging to the orientation of $M$ about $p$, with $W$ open in $\mathbf{H}^k$ but not open
in $\mathbf{R}^k$.

We shall use the covering of $M$ by the coordinate patches $\alpha : W \to Y$
to compute the integrals involved in the theorem. Note that in each case, the
map $\alpha$ can be extended if necessary to a $C^\infty$ function defined on an open set
of $\mathbf{R}^k$ containing $I^k$.

Step 2. Since the operator $d$ and the integrals involved are linear, it
suffices to prove the theorem in the special case where $\omega$ is a $k-1$ form such
that the set

$$C = M \cap (\text{Support } \omega)$$

can be covered by a single one of the coordinate patches $\alpha : W \to Y$. Since
the support of $d\omega$ is contained in the support of $\omega$, the set $M \cap (\text{Support } d\omega)$
is contained in $C$, so it is covered by the same coordinate patch.

Let $\eta$ denote the form $\alpha^*\omega$. The form $\eta$ can be extended if necessary
(without change of notation) to a $C^\infty$ form on an open set of $\mathbf{R}^k$ containing $I^k$.
Furthermore, $\eta$ vanishes at all points of $\mathrm{Bd}\, I^k$ except possibly at points of
the bottom face $(\mathrm{Int}\, I^{k-1}) \times 0$. Thus the hypotheses of the preceding lemma
are satisfied.

Step 3. We prove the theorem when $C$ is covered by a coordinate patch
$\alpha : W \to Y$ of the type constructed in the first case. Here $W = \mathrm{Int}\, I^k$ and $Y$
is disjoint from $\partial M$. We compute the integrals involved. Since $\alpha^* d\omega =
d\alpha^*\omega = d\eta$, we have

$$\int_M d\omega = \int_{\mathrm{Int}\, I^k} \alpha^* d\omega = \int_{\mathrm{Int}\, I^k} d\eta = (-1)^k \int_{\mathrm{Int}\, I^{k-1}} b^*\eta.$$

Here we use the preceding lemma. In this case, the form $\eta$ vanishes outside
$\mathrm{Int}\, I^k$. In particular, $\eta$ vanishes on $I^{k-1} \times 0$, so that $b^*\eta = 0$. Then $\int_M d\omega = 0$.

The theorem follows. If $\partial M$ is empty, this is the equation we wished to
prove. If $\partial M$ is non-empty, then the equation

$$\int_M d\omega = \int_{\partial M} \omega$$

holds trivially; for since the support of $\omega$ is disjoint from $\partial M$, the integral
of $\omega$ over $\partial M$ vanishes.

Step 4. Now we prove the theorem when $C$ is covered by a coordinate
patch $\alpha : W \to Y$ of the type constructed in the second case. Here $W$ is
open in $\mathbf{H}^k$ but not in $\mathbf{R}^k$, and $Y$ intersects $\partial M$. We have $\mathrm{Int}\, W = \mathrm{Int}\, I^k$.
We compute as before

$$\int_M d\omega = \int_{\mathrm{Int}\, I^k} d\eta = (-1)^k \int_{\mathrm{Int}\, I^{k-1}} b^*\eta.$$

We next compute the integral $\int_{\partial M} \omega$. The set $\partial M \cap (\text{Support } \omega)$ is
covered by the coordinate patch

$$\beta = \alpha \circ b : \mathrm{Int}\, I^{k-1} \to Y \cap \partial M$$

on $\partial M$, which is obtained by restricting $\alpha$. Now $\beta$ belongs to the induced
orientation of $\partial M$ if $k$ is even, and to the opposite orientation if $k$ is odd. If
we use $\beta$ to compute the integral of $\omega$ over $\partial M$, we must reverse the sign of
the integral when $k$ is odd. Thus we have

$$\int_{\partial M} \omega = (-1)^k \int_{\mathrm{Int}\, I^{k-1}} \beta^*\omega.$$

Since $\beta^*\omega = b^*(\alpha^*\omega) = b^*\eta$, the theorem follows. □
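
A numerical sanity check of Stokes' theorem in the simplest case may help fix ideas. The sketch below takes $M$ to be the closed unit disc in $\mathbf{R}^2$, oriented naturally, and the hypothetical 1-form $\omega = -y^3\, dx + x^3\, dy$, so that $d\omega = (3x^2 + 3y^2)\, dx \wedge dy$. It assumes the standard fact that the induced orientation of the unit circle is the counterclockwise one, and it evaluates $\int_M d\omega$ through the polar patch $\alpha(r, t) = (r \cos t, r \sin t)$, which has $\det D\alpha = r > 0$ and covers $M$ up to a set of measure zero (compare Theorem 35.2). Both integrals should be close to $3\pi/2$.

```python
# Sketch: check Stokes' theorem on the unit disc for omega = -y^3 dx + x^3 dy.
# The form and the midpoint-rule discretization are illustrative assumptions.
import math

def boundary_integral(n=2000):
    # Integrate omega over the unit circle, parametrized counterclockwise
    # by t -> (cos t, sin t) for t in (0, 2*pi).
    step = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        total += (-y ** 3 * dx + x ** 3 * dy) * step
    return total

def interior_integral(n=400):
    # Pull d(omega) = (3x^2 + 3y^2) dx ^ dy back by the polar patch; the
    # pullback is (3x^2 + 3y^2) * r dr ^ dt because det D(alpha) = r.
    dr = 1.0 / n
    dt = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            total += (3.0 * x * x + 3.0 * y * y) * r * dr * dt
    return total
```

Reversing the direction of the boundary parametrization changes the sign of `boundary_integral`, which is the numerical counterpart of the role played by the induced orientation in the theorem.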
We have proved Stokes’ theorem for manifolds of dimension $k$ greater
than 1. What happens if $k = 1$? If $\partial M$ is empty, there is no problem; one
proves readily that $\int_M d\omega = 0$. However, if $\partial M$ is non-empty, we face the
following questions: What does one mean by an “orientation” of a 0-manifold,
and how does one “integrate” a 0-form over an oriented 0-manifold?

To see what form Stokes’ theorem should take in this case, we consider
first a special case.

Definition. Let $M$ be a 1-manifold in $\mathbf{R}^n$. Suppose there is a one-to-one
map $\alpha : [a, b] \to M$ of class $C^\infty$, carrying $[a, b]$ onto $M$, such that $D\alpha(t) \neq 0$
for all $t$. Then we call $M$ a (smooth) arc in $\mathbf{R}^n$. If $M$ is oriented so that
the coordinate patch $\alpha|(a, b)$ belongs to the orientation, we say that $p = \alpha(a)$ is the
initial point of $M$ and $q = \alpha(b)$ is the final point of $M$. See Figure 37.3.



Figure 37.3 


*Theorem 37.3. Let $M$ be a 1-manifold in $\mathbf{R}^n$; let $f$ be a 0-form
defined in an open set about $M$. If $M$ is an arc with initial point $p$ and
final point $q$, then

$$\int_M df = f(q) - f(p).$$

Proof. Let $\alpha : [a, b] \to M$ be as in the preceding definition. Then
$\alpha : (a, b) \to M - p - q$ is a coordinate patch covering all of $M$ except for a
set of measure zero. By Theorem 35.2,

$$\int_M df = \int_{(a,b)} \alpha^*(df).$$

Now

$$\alpha^*(df) = d(f \circ \alpha) = D(f \circ \alpha)\, dt,$$

where $t$ denotes the general point of $\mathbf{R}$. Then

$$\int_{(a,b)} \alpha^*(df) = \int_{(a,b)} D(f \circ \alpha) = f(\alpha(b)) - f(\alpha(a)),$$

by the fundamental theorem of calculus. □
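
The computation in this proof is easy to imitate numerically. The sketch below uses a hypothetical arc, the helix $\alpha(t) = (\cos t, \sin t, t)$ on $[0, 2]$, and a hypothetical 0-form $f(x, y, z) = xy + z^2$; it approximates $\int_{(a,b)} D(f \circ \alpha)$ by the midpoint rule with a central-difference derivative and compares the result with $f(\alpha(b)) - f(\alpha(a))$.

```python
# Sketch: numerical illustration of Theorem 37.3 on a helical arc.
# The arc alpha and the function f are hypothetical examples.
import math

def alpha(t):
    # A smooth one-to-one map with nonvanishing derivative (a helix).
    return (math.cos(t), math.sin(t), t)

def f(p):
    x, y, z = p
    return x * y + z * z

def integral_df(a, b, n=2000, h=1e-6):
    # Midpoint rule for the integral of D(f o alpha) over (a, b),
    # with D(f o alpha) approximated by a central difference.
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        deriv = (f(alpha(t + h)) - f(alpha(t - h))) / (2.0 * h)
        total += deriv * step
    return total

def endpoint_difference(a, b):
    # f(q) - f(p) with p = alpha(a) and q = alpha(b).
    return f(alpha(b)) - f(alpha(a))
```

The two quantities agree up to discretization error, independently of how the arc winds between its endpoints.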

This result provides a guide for formulating Stokes’ theorem for 1-manifolds:

Definition. A compact 0-manifold $N$ in $\mathbf{R}^n$ is a finite collection of
points $\{x_1, \ldots, x_m\}$ of $\mathbf{R}^n$. We define an orientation of $N$ to be a function
$\epsilon$ mapping $N$ into the two-point set $\{-1, 1\}$. If $f$ is a 0-form defined
in an open set of $\mathbf{R}^n$ containing $N$, we define the integral of $f$ over the
oriented manifold $N$ by the equation

$$\int_N f = \sum_{i=1}^{m} \epsilon(x_i)\, f(x_i).$$

Definition. If $M$ is an oriented 1-manifold in $\mathbf{R}^n$ with non-empty boundary,
we define the induced orientation of $\partial M$ by setting $\epsilon(p) = -1$, for
$p \in \partial M$, if there is a coordinate patch $\alpha : U \to V$ on $M$ about $p$ belonging
to the orientation of $M$, with $U$ open in $\mathbf{H}^1$. We set $\epsilon(p) = +1$ otherwise.
See Figure 37.4.




Figure 37.4 


With these definitions, Stokes’ theorem takes the following form; the proof 
is left as an exercise. 
Theorem 37.4 (Stokes’ theorem in dimension 1). Let $M$ be a
compact oriented 1-manifold in $\mathbf{R}^n$; give $\partial M$ the induced orientation
if $\partial M$ is not empty. Let $f$ be a 0-form defined in an open set of $\mathbf{R}^n$
containing $M$. Then

$$\int_M df = \int_{\partial M} f$$

if $\partial M$ is not empty; and $\int_M df = 0$ if $\partial M$ is empty. □


EXERCISES 


1. Prove Stokes’ theorem for 1-manifolds. [Hint: Cover $M$ by coordinate
patches, belonging to the orientation of $M$, of the form $\alpha : W \to Y$,
where $W$ is one of the intervals $(0,1)$ or $[0,1)$ or $(-1,0]$. Prove the theorem
when the set $M \cap (\text{Support } f)$ is covered by one of these coordinate
patches.]

2. Suppose there is an $n-1$ form $\eta$ defined in $\mathbf{R}^n - 0$ such that $d\eta = 0$ and

$$\int_{S^{n-1}} \eta \neq 0.$$

Show that $\eta$ is not exact. (For the existence of such an $\eta$, see the exercises
of §30 and the exercises of either §35 or §38.)

3. Prove the following:

Theorem (Green’s theorem). Let $M$ be a compact 2-manifold
in $\mathbf{R}^2$, oriented naturally; give $\partial M$ the induced orientation. Let
$P\, dx + Q\, dy$ be a 1-form defined in an open set of $\mathbf{R}^2$ about $M$.
Then

$$\int_{\partial M} P\, dx + Q\, dy = \int_M (D_1 Q - D_2 P)\, dx \wedge dy.$$

4. Let $M$ be the 2-manifold in $\mathbf{R}^3$ consisting of all points $x$ such that

$$4(x_1)^2 + (x_2)^2 + 4(x_3)^2 = 4 \quad \text{and} \quad x_2 \geq 0.$$

Then $\partial M$ is the circle consisting of all points such that

$$(x_1)^2 + (x_3)^2 = 1 \quad \text{and} \quad x_2 = 0.$$

See Figure 37.5. The map

$$\alpha(u, v) = (u,\, 2[1 - u^2 - v^2]^{1/2},\, v),$$

for $u^2 + v^2 < 1$, is a coordinate patch on $M$ that covers $M - \partial M$.
Orient $M$ so that $\alpha$ belongs to the orientation, and give $\partial M$ the induced
orientation.
(a) What normal vector corresponds to the orientation of $M$? What
tangent vector corresponds to the induced orientation of $\partial M$?

(b) Let $\omega$ be the 1-form $\omega = x_2\, dx_1 + 3x_1\, dx_3$. Evaluate $\int_{\partial M} \omega$ directly.

(c) Evaluate $\int_M d\omega$ directly, by expressing it as an integral over the unit
disc in the $(u, v)$ plane.

5. The 3-ball $B^3(r)$ is a 3-manifold in $\mathbf{R}^3$; orient it naturally and give $S^2(r)$
the induced orientation. Assume that $\omega$ is a 2-form defined in $\mathbf{R}^3 - 0$ such
that

$$\int_{S^2(r)} \omega = a + (b/r),$$

for each $r > 0$.

(a) Given $0 < c < d$, let $M$ be the 3-manifold in $\mathbf{R}^3$ consisting of all $x$
with $c \leq \|x\| \leq d$, oriented naturally. Evaluate $\int_M d\omega$.

(b) If $d\omega = 0$, what can you say about $a$ and $b$?

(c) If $\omega = d\eta$ for some $\eta$ in $\mathbf{R}^3 - 0$, what can you say about $a$ and $b$?

6. Let $M$ be an oriented $k + \ell + 1$ manifold without boundary in $\mathbf{R}^n$. Let $\omega$
be a $k$-form and let $\eta$ be an $\ell$-form, both defined in an open set of $\mathbf{R}^n$
about $M$. Show that

$$\int_M \omega \wedge d\eta = a \int_M d\omega \wedge \eta$$

for some $a$, and determine $a$.

*§38. APPLICATIONS TO VECTOR ANALYSIS


In general, we know from the discussion in §31 that differential forms of order $k$
can be interpreted in $\mathbf{R}^n$ in certain cases as scalar or vector fields, namely when
$k = 0$, $1$, $n-1$, or $n$. We show here that integrals of forms can similarly be so
interpreted; then Stokes’ theorem can also in certain cases be interpreted in
terms of scalar or vector fields. These versions of the general Stokes’ theorem
include the classical theorems of the vector integral calculus.

We consider the various cases one-by-one.


The gradient theorem for 1-manifolds in R n 

First, we interpret the integral of a 1-form in terms of vector fields. If $F$
is a vector field defined in an open set of $\mathbf{R}^n$, then $F$ corresponds under the
“translation map” $\alpha_1$ to a certain 1-form $\omega$. (See Theorem 31.1.) It turns out
that the integral of $\omega$ over an oriented 1-manifold equals the integral, with
respect to 1-volume, of the tangential component of the vector field $F$. That
is the substance of the following lemma:

Lemma 38.1. Let $M$ be a compact oriented 1-manifold in $\mathbf{R}^n$;
let $T$ be the unit tangent vector to $M$ corresponding to the orientation.
Let

$$F(x) = (x; f(x)) = \Big(x; \sum f_i(x)\, e_i\Big)$$

be a vector field defined in an open set of $\mathbf{R}^n$ containing $M$; it corresponds
to the 1-form

$$\omega = \sum f_i\, dx_i.$$

Then

$$\int_M \omega = \int_M \langle F, T \rangle\, ds.$$

Here we use the classical notation “$ds$” rather than “$dV$” to denote the
integral with respect to 1-volume (arc length), simply to make our theorems
resemble more closely the classical theorems of vector integral calculus.

Note that if one replaces $M$ by $-M$, then the integral $\int_M \omega$ changes
sign. This replacement has the effect of replacing $T$ by $-T$; thus the integral
$\int_M \langle F, T \rangle\, ds$ also changes sign.

Proof. We give two proofs of this lemma. The first relies on the results 
of §36; the second does not. 

First proof. By Theorem 36.2, we have

$$\int_M \omega = \int_M \lambda\, ds,$$

where $\lambda(p)$ is the value of $\omega(p)$ on an orthonormal basis for $T_p(M)$ that
belongs to the natural orientation of this tangent space. In the present case,
the tangent space is 1-dimensional, and $T(p)$ is such an orthonormal basis.
Let $T(p) = (p; t)$. Since $\omega = \sum f_i\, dx_i$,

$$\omega(p)(p; t) = \sum f_i(p)\, t_i(p).$$

Thus

$$\lambda(p) = \langle F(p), T(p) \rangle,$$

and the lemma follows.

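Second proof. Since the integrals involved are linear in $\omega$ and $F$,
respectively, it suffices to prove the lemma in the case where the set

$$C = M \cap (\text{Support } \omega)$$

lies in a single coordinate patch $\alpha : U \to V$ belonging to the orientation of $M$.
In that case, we simply compute both integrals. Let $t$ denote the general point
of $\mathbf{R}$. Then

$$\alpha^* \omega = \sum_{i=1}^{n} (f_i \circ \alpha) \, d\alpha_i = \sum_{i=1}^{n} (f_i \circ \alpha)(D\alpha_i) \, dt = \langle f \circ \alpha, D\alpha \rangle \, dt.$$

It follows that

$$\int_M \omega = \int_{\text{Int } U} \alpha^* \omega = \int_{\text{Int } U} \langle f \circ \alpha, D\alpha \rangle.$$

On the other hand,

$$\int_M \langle F, T \rangle \, ds = \int_{\text{Int } U} \langle F \circ \alpha, T \circ \alpha \rangle \cdot V(D\alpha) = \int_{\text{Int } U} \langle f \circ \alpha, D\alpha / \|D\alpha\| \rangle \cdot V(D\alpha) = \int_{\text{Int } U} \langle f \circ \alpha, D\alpha \rangle,$$

since

$$V(D\alpha) = [\det(D\alpha^{\text{tr}} \cdot D\alpha)]^{1/2} = \|D\alpha\|.$$

The lemma follows. □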
Theorem 38.2 (The gradient theorem). Let $M$ be a compact
1-manifold in $\mathbf{R}^n$; let $T$ be a unit tangent vector field to $M$. Let $f$ be a
$C^\infty$ function defined in an open set about $M$. If $\partial M$ is empty, then

$$\int_M \langle \operatorname{grad} f, T \rangle \, ds = 0.$$

If $\partial M$ consists of the points $x_1, \ldots, x_m$, let $\epsilon_i = -1$ if $T$ points into $M$
at $x_i$ and $\epsilon_i = +1$ otherwise. Then

$$\int_M \langle \operatorname{grad} f, T \rangle \, ds = \sum_{i=1}^{m} \epsilon_i f(x_i).$$

Proof. The 1-form $df$ corresponds to the vector field $\operatorname{grad} f$, by
Theorem 31.1. Therefore

$$\int_M df = \int_M \langle \operatorname{grad} f, T \rangle \, ds,$$

by the preceding lemma. Our theorem then follows from the 1-dimensional
version of Stokes’ theorem. □
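
As a sanity check, the gradient theorem can be tested numerically on a single arc. The sketch below is only an illustration: the function `f`, the helix patch `alpha`, and the midpoint-rule integrator are choices made here, not part of the text. Since $T = D\alpha/\|D\alpha\|$ and $ds = \|D\alpha\|\,dt$, the integrand reduces to $\langle \operatorname{grad} f(\alpha(t)), D\alpha(t) \rangle$.

```python
import math

def f(x, y, z):
    # Hypothetical test function; any smooth f would do.
    return x * y + z * z

def grad_f(x, y, z):
    # Gradient of f, computed by hand.
    return (y, x, 2.0 * z)

def alpha(t):
    # Helix arc: a coordinate patch covering the 1-manifold M.
    return (math.cos(t), math.sin(t), t)

def d_alpha(t):
    # Derivative of alpha; T is this vector divided by its norm.
    return (-math.sin(t), math.cos(t), 1.0)

def tangential_integral(a, b, steps=20000):
    """Midpoint-rule value of the integral of <grad f, T> ds over alpha([a, b])."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        g = grad_f(*alpha(t))
        v = d_alpha(t)
        total += sum(gi * vi for gi, vi in zip(g, v)) * h
    return total

a, b = 0.0, 2.0
lhs = tangential_integral(a, b)
# T points into M at alpha(a) (epsilon = -1) and out of M at alpha(b) (epsilon = +1).
rhs = f(*alpha(b)) - f(*alpha(a))
print(abs(lhs - rhs) < 1e-6)
```

The two sides agree up to quadrature error, whatever smooth `f` is substituted, provided `grad_f` is updated to match.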


The divergence theorem for $n-1$ manifolds in $\mathbf{R}^n$

Now we interpret the integral of an $n-1$ form, over an oriented $n-1$
manifold $M$, in terms of vector fields. First, we must verify a result stated
earlier, the fact that an orientation of $M$ determines a unit normal vector
field to $M$. Recall the following definition from §34:


Definition. Let $M$ be an oriented $n-1$ manifold in $\mathbf{R}^n$. Given $p \in M$,
let $(p; n)$ be a unit vector in $T_p(\mathbf{R}^n)$ that is orthogonal to the $n-1$ dimensional
linear subspace $T_p(M)$. If $\alpha : U \to V$ is a coordinate patch on $M$ about $p$
belonging to the orientation of $M$ with $\alpha(x) = p$, choose $n$ so that the frame

$$\Bigl( n, \frac{\partial \alpha}{\partial x_1}(x), \ldots, \frac{\partial \alpha}{\partial x_{n-1}}(x) \Bigr)$$

is right-handed. The vector field $N(p) = (p; n(p))$ is called the unit normal
field corresponding to the orientation of $M$.


We show $N(p)$ is well-defined, and of class $C^\infty$. To show it is well-defined,
let $\beta$ be another coordinate patch about $p$, belonging to the orientation of $M$.
Let $g = \beta^{-1} \circ \alpha$ be the transition function, and let $g(x) = y$. Since $\alpha = \beta \circ g$,

$$D\alpha(x) = D\beta(y) \cdot Dg(x).$$

Then for any $v \in \mathbf{R}^n$, we have the equation

$$[v \;\; D\alpha(x)] = [v \;\; D\beta(y)] \begin{bmatrix} 1 & 0 \\ 0 & Dg(x) \end{bmatrix}.$$

(Here $D\alpha$ and $D\beta$ have size $n$ by $n-1$, so each of these three matrices has
size $n$ by $n$.) It follows that

$$\det[v \;\; D\alpha(x)] = \det[v \;\; D\beta(y)] \cdot \det Dg(x).$$

Since $\det Dg > 0$, we conclude that $[v \;\; D\alpha(x)]$ has positive determinant if
and only if $[v \;\; D\beta(y)]$ does.

To show that $N$ is of class $C^\infty$, we obtain a formula for it. As motivation,
let us consider the case $n = 3$:


EXAMPLE 1. Given two vectors $a$ and $b$ in $\mathbf{R}^3$, one learns in calculus that
their cross product $c = a \times b$ is perpendicular to both, that the frame $(c, a, b)$
is right-handed, and that $\|c\|$ equals $V(a, b)$. The vector $c$ is, of course, the
vector with components

$$c_1 = \det \begin{bmatrix} a_2 & b_2 \\ a_3 & b_3 \end{bmatrix}, \quad c_2 = -\det \begin{bmatrix} a_1 & b_1 \\ a_3 & b_3 \end{bmatrix}, \quad c_3 = \det \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}.$$

It follows that if $M$ is an oriented 2-manifold in $\mathbf{R}^3$, and if $\alpha : U \to V$ is a
coordinate patch on $M$ belonging to the orientation of $M$, and if we set

$$c = \frac{\partial \alpha}{\partial x_1} \times \frac{\partial \alpha}{\partial x_2},$$

then the vector $n = c / \|c\|$ gives the corresponding unit normal to $M$. See
Figure 38.1.



Figure 38.1 
There is a formula similar to the cross product formula for determining $n$
in general, as we now show.

Lemma 38.3. Given independent vectors $x_1, \ldots, x_{n-1}$ in $\mathbf{R}^n$,
let $X$ be the $n$ by $n-1$ matrix $X = [x_1 \; \cdots \; x_{n-1}]$, and let $c$ be the vector
$c = \sum_i c_i e_i$, where

$$c_i = (-1)^{i-1} \det X(1, \ldots, \hat{\imath}, \ldots, n).$$

The vector $c$ has the following properties:

(1) $c$ is non-zero and orthogonal to each $x_i$.

(2) The frame $(c, x_1, \ldots, x_{n-1})$ is right-handed.

(3) $\|c\| = V(X)$.

Proof. We begin with a preliminary calculation. Let $x_1, \ldots, x_{n-1}$ be
fixed. Given $a \in \mathbf{R}^n$, we compute the following determinant; expanding by
cofactors of the first column, we have:

$$\det[a \;\; x_1 \; \cdots \; x_{n-1}] = \sum_{i=1}^{n} a_i (-1)^{i-1} \det X(1, \ldots, \hat{\imath}, \ldots, n) = \langle a, c \rangle.$$

This equation contains all that is needed to prove the lemma.

(1) Set $a = x_i$. Then the matrix $[a \;\; x_1 \; \cdots \; x_{n-1}]$ has two identical columns,
so its determinant vanishes. Thus $\langle x_i, c \rangle = 0$ for all $i$, so $c$ is orthogonal to
each $x_i$. To show $c \neq 0$, we note that since the columns of $X$ span a space
of dimension $n-1$, so do the rows of $X$. Hence some set consisting of $n-1$
rows of $X$ is independent, say the set consisting of all rows but the $i$th. Then
$c_i \neq 0$; whence $c \neq 0$.

(2) Set $a = c$. Then

$$\det[c \;\; x_1 \; \cdots \; x_{n-1}] = \langle c, c \rangle = \|c\|^2 > 0.$$

Thus the frame $(c, x_1, \ldots, x_{n-1})$ is right-handed.

(3) This equation follows at once from Theorem 21.4. Alternatively, one
can compute the matrix product

$$[c \;\; x_1 \; \cdots \; x_{n-1}]^{\text{tr}} \cdot [c \;\; x_1 \; \cdots \; x_{n-1}] = \begin{bmatrix} \|c\|^2 & 0 \\ 0 & X^{\text{tr}} \cdot X \end{bmatrix}.$$

Taking determinants and using the formula in (2), we have

$$\|c\|^4 = \|c\|^2 \, V(X)^2.$$

Since $\|c\| \neq 0$, we conclude that $\|c\| = V(X)$. □
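
The construction in Lemma 38.3 is easy to carry out by machine. The following sketch is an illustration only: the helper names `det`, `generalized_cross`, and `dot`, and the sample vectors in $\mathbf{R}^4$, are assumptions made here. It computes $c$ from the signed $(n-1) \times (n-1)$ minors of $X$ and checks properties (1) to (3) in exact integer arithmetic, using $V(X)^2 = \det(X^{\text{tr}} \cdot X)$.

```python
def det(m):
    """Determinant by cofactor expansion along the first column (exact for ints)."""
    n = len(m)
    if n == 0:
        return 1
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total

def generalized_cross(vectors):
    """Return c with c_i = (-1)^(i-1) det X(1, ..., i omitted, ..., n), i 1-based.

    `vectors` holds n-1 vectors of R^n; X is the n by n-1 matrix with these columns.
    """
    n = len(vectors) + 1
    rows = [[v[r] for v in vectors] for r in range(n)]  # rows of X
    c = []
    for i in range(n):  # i is 0-based here, so the sign is (-1)^i
        minor = rows[:i] + rows[i + 1:]
        c.append((-1) ** i * det(minor))
    return c

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical independent vectors x_1, x_2, x_3 in R^4.
xs = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1]]
c = generalized_cross(xs)

# (1) c is non-zero and orthogonal to each x_i.
print(any(c), [dot(c, x) for x in xs])

# (2) det[c x_1 ... x_{n-1}] equals <c, c>, which is positive.
frame = [[c[r]] + [x[r] for x in xs] for r in range(len(c))]
print(det(frame) == dot(c, c) > 0)

# (3) ||c||^2 equals V(X)^2 = det(X^tr X).
gram = [[dot(u, v) for v in xs] for u in xs]
print(dot(c, c) == det(gram))
```

For $n = 3$ the same function reproduces the ordinary cross product of Example 1.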
Corollary 38.4. If $M$ is an oriented $n-1$ manifold in $\mathbf{R}^n$, then
the unit normal vector $N(p)$ corresponding to the orientation of $M$ is
a $C^\infty$ function of $p$.


Proof. If $\alpha : U \to V$ is a coordinate patch on $M$ about $p$, let

$$c_i(x) = (-1)^{i-1} \det D\alpha(1, \ldots, \hat{\imath}, \ldots, n)(x)$$

for $x \in U$, and let $c(x) = \sum_i c_i(x) e_i$. Then for all $p \in V$, we have

$$N(p) = (p; \, c(x) / \|c(x)\|),$$

where $x = \alpha^{-1}(p)$; this function is of class $C^\infty$ as a function of $p$. □

Now we interpret the integral of an $n-1$ form in terms of vector fields.
If $G$ is a vector field in $\mathbf{R}^n$, then $G$ corresponds under the “translation map”
$\beta_{n-1}$ to a certain $n-1$ form $\omega$ in $\mathbf{R}^n$. (See Theorem 31.1.) It turns out that
the integral of $\omega$ over an oriented $n-1$ manifold $M$ equals the integral over $M$,
with respect to volume, of the normal component of the vector field $G$. That
is the substance of the following lemma:


Lemma 38.5. Let $M$ be a compact oriented $n-1$ manifold in $\mathbf{R}^n$;
let $N$ be the corresponding unit normal vector field. Let $G$ be a vector
field defined in an open set $U$ of $\mathbf{R}^n$ containing $M$. If we denote the
general point of $\mathbf{R}^n$ by $y$, this vector field has the form

$$G(y) = (y; g(y)) = \Bigl(y; \sum_i g_i(y) e_i\Bigr);$$

it corresponds to the $n-1$ form

$$\omega = \sum_{i=1}^{n} (-1)^{i-1} g_i \, dy_1 \wedge \cdots \wedge \widehat{dy_i} \wedge \cdots \wedge dy_n.$$

Then

$$\int_M \omega = \int_M \langle G, N \rangle \, dV.$$

Note that if we replace $M$ by $-M$, then the integral $\int_M \omega$ changes sign.
This replacement has the effect of replacing $N$ by $-N$, so that the integral
$\int_M \langle G, N \rangle \, dV$ also changes sign.

Proof. We give two proofs of this lemma. The first relies on the results
of §36 and the second does not.
First proof. By Theorem 36.2, we have

$$\int_M \omega = \int_M \lambda \, dV,$$

where $\lambda(p)$ is the value of $\omega(p)$ on an orthonormal basis for $T_p(M)$ that
belongs to the natural orientation of this tangent space. We show that $\lambda =
\langle G, N \rangle$, and the proof is complete.

Let $(p; a_1), \ldots, (p; a_{n-1})$ be an orthonormal basis for $T_p(M)$ that
belongs to its natural orientation. Let $A$ be the matrix $A = [a_1 \; \cdots \; a_{n-1}]$; and
let $c$ be the vector $c = \sum_i c_i e_i$, where

$$c_i = (-1)^{i-1} \det A(1, \ldots, \hat{\imath}, \ldots, n).$$

By Lemma 38.3, the vector $c$ is orthogonal to each $a_i$, and the frame
$(c, a_1, \ldots, a_{n-1})$ is right-handed, and

$$\|c\| = V(A) = [\det(A^{\text{tr}} \cdot A)]^{1/2} = [\det I_{n-1}]^{1/2} = 1.$$

Then $N = (p; c)$ is the unit normal to $M$ at $p$ corresponding to the orientation
of $M$. Now by Theorem 27.7, we have

$$dy_1 \wedge \cdots \wedge \widehat{dy_i} \wedge \cdots \wedge dy_n \bigl( (p; a_1), \ldots, (p; a_{n-1}) \bigr) = \det A(1, \ldots, \hat{\imath}, \ldots, n).$$

Then

$$\lambda(p) = \sum_{i=1}^{n} (-1)^{i-1} g_i(p) \det A(1, \ldots, \hat{\imath}, \ldots, n) = \sum_{i=1}^{n} g_i(p) \cdot c_i.$$

Thus $\lambda = \langle G, N \rangle$, as desired.

Second proof. Since the integrals involved in the statement of the
lemma are linear in $\omega$ and $G$, respectively, it suffices to prove the lemma in
the case where the set

$$C = M \cap (\text{Support } \omega)$$

lies in a single coordinate patch $\alpha : U \to V$ belonging to the orientation of $M$.
We compute the first integral as follows:

$$\int_M \omega = \int_{\text{Int } U} \alpha^* \omega = \int_{\text{Int } U} \Bigl[ \sum_{i=1}^{n} (-1)^{i-1} (g_i \circ \alpha) \det D\alpha(1, \ldots, \hat{\imath}, \ldots, n) \Bigr],$$

by Theorem 32.2. To compute the second integral, set $c = \sum_i c_i e_i$, where

$$c_i = (-1)^{i-1} \det D\alpha(1, \ldots, \hat{\imath}, \ldots, n).$$

If $N$ is the unit normal corresponding to the orientation, then as in the
preceding corollary, $N(\alpha(x)) = (\alpha(x); \, c(x) / \|c(x)\|)$. We compute

$$\int_M \langle G, N \rangle \, dV = \int_{\text{Int } U} \langle G \circ \alpha, N \circ \alpha \rangle \cdot V(D\alpha) = \int_{\text{Int } U} \langle g \circ \alpha, c \rangle \quad \text{since } \|c\| = V(D\alpha),$$

$$= \int_{\text{Int } U} \Bigl[ \sum_{i=1}^{n} (g_i \circ \alpha) (-1)^{i-1} \det D\alpha(1, \ldots, \hat{\imath}, \ldots, n) \Bigr].$$

The lemma follows. □


Now we interpret the integral of an $n$-form in terms of scalar fields. The
interpretation is just what one might expect:

Lemma 38.6. Let $M$ be a compact $n$-manifold in $\mathbf{R}^n$, oriented
naturally. Let $\omega = h \, dx_1 \wedge \cdots \wedge dx_n$ be an $n$-form defined in an open set
of $\mathbf{R}^n$ containing $M$. Then $h$ is the corresponding scalar field, and

$$\int_M \omega = \int_M h \, dV.$$

Proof. First proof. We use the results of §36. We have

$$\int_M \omega = \int_M \lambda \, dV,$$

where $\lambda$ is obtained by evaluating $\omega$ on an orthonormal basis for $T_p(M)$ that
belongs to its natural orientation. Now $\alpha$ belongs to the orientation of $M$
if $\det D\alpha > 0$; thus the natural orientation of $T_p(M)$ consists of the
right-handed frames. The usual basis for $T_p(M) = T_p(\mathbf{R}^n)$ is one such frame, and
the value of $\omega$ on this frame is $h$.

Second proof. It suffices to consider the case where the set
$M \cap (\text{Support } \omega)$ is covered by a coordinate patch $\alpha : U \to V$ belonging
to the orientation of $M$. We have by definition

$$\int_M \omega = \int_{\text{Int } U} \alpha^* \omega = \int_{\text{Int } U} (h \circ \alpha) \det D\alpha,$$

$$\int_M h \, dV = \int_{\text{Int } U} (h \circ \alpha) \, V(D\alpha).$$

Now $V(D\alpha) = |\det D\alpha| = \det D\alpha$, since $\alpha$ belongs to the natural orientation
of $M$. □


We note that the integral $\int_M h \, dV$ in fact equals the ordinary integral
of $h$ over the bounded subset $M$ of $\mathbf{R}^n$. For if $A = M - \partial M$, then $A$ is open
in $\mathbf{R}^n$, and the identity map $i : A \to A$ is a coordinate patch on $M$, belonging
to its natural orientation, that covers all of $M$ except for a set of measure
zero in $M$. By Theorem 25.4,

$$\int_M h \, dV = \int_A (h \circ i) \, V(Di) = \int_A h.$$

The latter is an ordinary integral; it equals $\int_M h$ because $\partial M$ has measure
zero in $\mathbf{R}^n$.

We now examine, for an $n$-manifold $M$ in $\mathbf{R}^n$, naturally oriented, what
the induced orientation of $\partial M$ looks like. We considered the case $n = 3$ in
Example 4 of §34. A result similar to that one holds in general:

Lemma 38.7. Let $M$ be an $n$-manifold in $\mathbf{R}^n$. If $M$ is oriented
naturally, then the induced orientation of $\partial M$ corresponds to the unit
normal field $N$ to $\partial M$ that points outwards from $M$ at each point of $\partial M$.

The inward normal to $\partial M$ at $p$ is the velocity vector of a curve that
begins at $p$ and moves into $M$ as the parameter value increases. The outward
normal is its negative.

Proof. Let $\alpha : U \to V$ be a coordinate patch on $M$ about $p$ belonging
to the orientation of $M$. Then $\det D\alpha > 0$. Let $b : \mathbf{R}^{n-1} \to \mathbf{R}^n$ be the map

$$b(x_1, \ldots, x_{n-1}) = (x_1, \ldots, x_{n-1}, 0).$$

The map $\alpha_0 = \alpha \circ b$ is a coordinate patch on $\partial M$ about $p$. It belongs to the
induced orientation of $\partial M$ if $n$ is even, and to its opposite if $n$ is odd. Let $N$
be the unit normal field to $\partial M$ corresponding to the induced orientation of
$\partial M$; let $N(p) = (p; n(p))$. Then

$$\det[(-1)^n n \;\; D\alpha_0] > 0,$$

which implies that

$$\det[D\alpha_0 \;\; n] = \det\Bigl[ \frac{\partial \alpha}{\partial x_1} \; \cdots \; \frac{\partial \alpha}{\partial x_{n-1}} \;\; n \Bigr] < 0.$$

On the other hand, we have

$$\det D\alpha = \det\Bigl[ \frac{\partial \alpha}{\partial x_1} \; \cdots \; \frac{\partial \alpha}{\partial x_{n-1}} \;\; \frac{\partial \alpha}{\partial x_n} \Bigr] > 0.$$

The vector $\partial \alpha / \partial x_n$ is the velocity vector of a curve that begins at a point of
$\partial M$ and moves into $M$ as the parameter increases. Thus $n$ is the outward
normal to $\partial M$ at $p$. □


Theorem 38.8 (The divergence theorem). Let $M$ be a compact
$n$-manifold in $\mathbf{R}^n$. Let $N$ be the unit normal vector field to $\partial M$ that
points outwards from $M$. If $G$ is a vector field defined in an open set of
$\mathbf{R}^n$ containing $M$, then

$$\int_M (\operatorname{div} G) \, dV = \int_{\partial M} \langle G, N \rangle \, dV.$$

Here the left-hand integral involves integration with respect to $n$-volume,
and the right-hand integral involves integration with respect to $n-1$ volume.


Proof. Given $G$, let $\omega = \beta_{n-1} G$ be the corresponding $n-1$ form.
Orient $M$ naturally and give $\partial M$ the induced orientation. Then the normal
field $N$ corresponds to the orientation of $\partial M$, by Lemma 38.7, so that

$$\int_{\partial M} \omega = \int_{\partial M} \langle G, N \rangle \, dV,$$

by Lemma 38.5. According to Theorem 31.1, the scalar field $\operatorname{div} G$ corresponds
to the $n$-form $d\omega$; that is, $d\omega = (\operatorname{div} G) \, dx_1 \wedge \cdots \wedge dx_n$. Then Lemma 38.6
implies that

$$\int_M d\omega = \int_M (\operatorname{div} G) \, dV.$$

The theorem follows from Stokes’ theorem. □


In $\mathbf{R}^3$, the divergence theorem is sometimes called Gauss’ theorem.
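
The divergence theorem can be checked numerically in the simplest non-trivial setting, the closed unit disk in $\mathbf{R}^2$ (so $n = 2$ and $\partial M$ is the unit circle). The sketch below is only illustrative: the field `G`, its hand-computed divergence `div_G`, and the midpoint-rule quadratures are assumptions made here.

```python
import math

def G(x, y):
    # Hypothetical smooth vector field on R^2 chosen for illustration.
    return (x ** 3 + y, y ** 3 + x * y)

def div_G(x, y):
    # d/dx (x^3 + y) + d/dy (y^3 + x*y), computed by hand.
    return 3.0 * x * x + 3.0 * y * y + x

def volume_side(nr=400, nt=400):
    """Integral of div G over the unit disk, midpoint rule in polar coordinates."""
    hr, ht = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            t = (j + 0.5) * ht
            # The factor r is V(D alpha) for the polar coordinate patch.
            total += div_G(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

def boundary_side(nt=4000):
    """Integral of <G, N> over the unit circle, N the outward unit normal."""
    ht = 2.0 * math.pi / nt
    total = 0.0
    for j in range(nt):
        t = (j + 0.5) * ht
        x, y = math.cos(t), math.sin(t)
        gx, gy = G(x, y)
        total += (gx * x + gy * y) * ht  # N = (x, y) on the unit circle
    return total

print(abs(volume_side() - boundary_side()) < 1e-4)
```

For this particular field both sides work out by hand to $3\pi/2$, which the quadratures reproduce to within their discretization error.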

Stokes’ theorem for 2-manifolds in $\mathbf{R}^3$

There is one more situation in which we can translate the general Stokes’
theorem into a theorem about vector fields. It occurs when $M$ is an oriented
2-manifold in $\mathbf{R}^3$.


Theorem 38.9 (Stokes’ theorem, classical version). Let $M$ be
a compact orientable 2-manifold in $\mathbf{R}^3$. Let $N$ be a unit normal field
to $M$. Let $F$ be a $C^\infty$ vector field defined in an open set about $M$. If $\partial M$
is empty, then

$$\int_M \langle \operatorname{curl} F, N \rangle \, dV = 0.$$

If $\partial M$ is non-empty, let $T$ be the unit tangent vector field to $\partial M$ chosen
so that the vector $W(p) = N(p) \times T(p)$ points into $M$ from $\partial M$. Then

$$\int_M \langle \operatorname{curl} F, N \rangle \, dV = \int_{\partial M} \langle F, T \rangle \, ds.$$

Proof. Given $F$, let $\omega = \alpha_1 F$ be the corresponding 1-form. Then
according to Theorem 31.2, the vector field $\operatorname{curl} F$ corresponds to the 2-form
$d\omega$. Orient $M$ so that $N$ is the corresponding unit normal field. Then by
Lemma 38.5,

$$\int_M d\omega = \int_M \langle \operatorname{curl} F, N \rangle \, dV.$$

On the other hand, if $\partial M$ is non-empty, its induced orientation corresponds to
the unit tangent field $T$. (See Example 5 of §34.) It follows from Lemma 38.1
that

$$\int_{\partial M} \omega = \int_{\partial M} \langle F, T \rangle \, ds.$$

The theorem now follows from Stokes’ theorem. □
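
The classical Stokes’ theorem can likewise be tested numerically. The sketch below is an illustration under assumptions made here: $M$ is the upper unit hemisphere with outward normal $N$, the field `F` and its hand-computed curl `curl_F` are arbitrary choices, and both integrals use the midpoint rule. The surface integral is evaluated through a spherical coordinate patch $\alpha(\phi, \theta)$, using the fact that $\partial\alpha/\partial\phi \times \partial\alpha/\partial\theta$ equals $V(D\alpha)$ times $N$.

```python
import math

def F(x, y, z):
    # Hypothetical smooth vector field on R^3 chosen for illustration.
    return (z - y, x, x * y)

def curl_F(x, y, z):
    # Curl of F above, computed by hand.
    return (x, 1.0 - y, 2.0)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def surface_side(n_phi=300, n_theta=300):
    """Integral of <curl F, N> dV over the upper unit hemisphere, N outward."""
    h_phi, h_theta = (math.pi / 2) / n_phi, 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * h_phi
        sp, cp = math.sin(phi), math.cos(phi)
        for j in range(n_theta):
            theta = (j + 0.5) * h_theta
            st, ct = math.sin(theta), math.cos(theta)
            p = (sp * ct, sp * st, cp)           # alpha(phi, theta)
            d_phi = (cp * ct, cp * st, -sp)      # partial of alpha in phi
            d_theta = (-sp * st, sp * ct, 0.0)   # partial of alpha in theta
            c = cross(d_phi, d_theta)            # V(D alpha) times outward N
            total += dot(curl_F(*p), c) * h_phi * h_theta
    return total

def boundary_side(n=4000):
    """Integral of <F, T> ds over the unit circle in the plane z = 0,
    traversed counterclockwise, so that N x T points into the hemisphere."""
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        p = (math.cos(t), math.sin(t), 0.0)
        velocity = (-math.sin(t), math.cos(t), 0.0)
        total += dot(F(*p), velocity) * h
    return total

print(abs(surface_side() - boundary_side()) < 1e-3)
```

At the boundary point $(1, 0, 0)$ one has $N = (1, 0, 0)$ and $T = (0, 1, 0)$, so $N \times T = (0, 0, 1)$ points into the hemisphere, which is the orientation convention of the theorem.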



EXERCISES 


1. Let $G$ be a vector field in $\mathbf{R}^3 - 0$. Let $S^2(r)$ be the sphere of radius $r$ in
$\mathbf{R}^3$ centered at 0. Let $N_r$ be the unit normal to $S^2(r)$ that points away
from the origin. If $\operatorname{div} G(x) = 1/\|x\|$, and if $0 < c < d$, what can you
say about the relation between the values of the integral

$$\int_{S^2(r)} \langle G, N_r \rangle \, dV$$

for $r = c$ and $r = d$?

2. Let $G$ be a vector field defined in $A = \mathbf{R}^n - 0$ with $\operatorname{div} G = 0$ in $A$.

(a) Let $M_1$ and $M_2$ be compact $n$-manifolds in $\mathbf{R}^n$, such that the origin
is contained in both $M_1 - \partial M_1$ and $M_2 - \partial M_2$. Let $N_i$ be the unit
outward normal vector field to $\partial M_i$, for $i = 1, 2$. Show that

$$\int_{\partial M_1} \langle G, N_1 \rangle \, dV = \int_{\partial M_2} \langle G, N_2 \rangle \, dV.$$

[Hint: Consider first the case where $M_2 = B^n(\epsilon)$ and is contained
in $M_1 - \partial M_1$. See Figure 38.2.]
Figure 38.2


(b) Show that as $M$ ranges over all compact $n$-manifolds in $\mathbf{R}^n$ for which
the origin is not in $\partial M$, the integral

$$\int_{\partial M} \langle G, N \rangle \, dV,$$

where $N$ is the unit normal to $\partial M$ pointing outwards from $M$, has
only two possible values.

3. Let $G$ be a vector field in $B = \mathbf{R}^n - p - q$ with $\operatorname{div} G = 0$ in $B$. As $M$
ranges over all compact $n$-manifolds in $\mathbf{R}^n$ for which $p$ and $q$ are not in
$\partial M$, how many possible values does the integral

$$\int_{\partial M} \langle G, N \rangle \, dV$$

have? (Here $N$ is the unit normal to $\partial M$ pointing outwards from $M$.)

4. Let $\eta$ be the $n-1$ form in $A = \mathbf{R}^n - 0$ defined by the equation

$$\eta = \sum_{i=1}^{n} (-1)^{i-1} f_i \, dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n,$$

where $f_i(x) = x_i / \|x\|^m$. Orient the unit ball $B^n$ naturally, and give
$S^{n-1} = \partial B^n$ the induced orientation. Show that

$$\int_{S^{n-1}} \eta = v(S^{n-1}).$$

[Hint: If $G$ is the vector field corresponding to $\eta$, and $N$ is the unit
outward normal field to $S^{n-1}$, then $\langle G, N \rangle = 1$.]
5. Let $S$ be the subset of $\mathbf{R}^3$ consisting of the union of:

(i) the $z$-axis,

(ii) the unit circle $x^2 + y^2 = 1$, $z = 0$,

(iii) the points $(0, y, 0)$ with $y > 1$.

Let $A$ be the open set $\mathbf{R}^3 - S$ of $\mathbf{R}^3$. Let $C_1, C_2, D_1, D_2, D_3$ be the
oriented 1-manifolds in $A$ that are pictured in Figure 38.3. Suppose that
$F$ is a vector field in $A$, with $\operatorname{curl} F = 0$ in $A$, and that



and 



ds = 7. 


What can you say about the integral

$$\int_{D_i} \langle F, T \rangle \, ds$$

for $i = 1, 2, 3$? Justify your answers.



Figure 38.3 




Closed Forms and Exact Forms 


In the applications of vector analysis to physics, it is often important to know
whether a given vector field $F$ in $\mathbf{R}^3$ is the gradient of a scalar field $f$. If
it is, $F$ is said to be conservative, and the function $f$ (or sometimes its
negative) is called a potential function for $F$. Translated into the language
of forms, this question is just the question whether a given 1-form $\omega$ in $\mathbf{R}^3$ is
the differential of a 0-form, that is, whether $\omega$ is exact.

In other applications to physics, one wishes to know whether a given
vector field $G$ in $\mathbf{R}^3$ is the curl of another vector field $F$. Translated into the
language of forms, this is just the question whether a given 2-form $\omega$ in $\mathbf{R}^3$ is
the differential of a 1-form, that is, whether $\omega$ is exact.

We study here the analogous question in $\mathbf{R}^n$. If $\omega$ is a $k$-form defined
in an open set $A$ of $\mathbf{R}^n$, then a necessary condition for $\omega$ to be exact is the
condition that $\omega$ be closed, i.e., that $d\omega = 0$. This condition is not in general
sufficient. We explore in this chapter what additional conditions, either on $A$
or on both $A$ and $\omega$, are needed in order to ensure that $\omega$ is exact.
§39. THE POINCARÉ LEMMA


Let $A$ be an open set in $\mathbf{R}^n$. We show in this section that if $A$ satisfies
a certain condition called star-convexity, then any closed form $\omega$ on $A$ is
automatically exact. This result is a famous one called the Poincaré lemma.

We begin with a preliminary result: 

Theorem 39.1 (Leibnitz’s rule). Let $Q$ be a rectangle in $\mathbf{R}^n$; let
$f : Q \times [a,b] \to \mathbf{R}$ be a continuous function. Denote $f$ by $f(x,t)$ for $x \in Q$
and $t \in [a,b]$. Then the function

$$F(x) = \int_{t=a}^{t=b} f(x,t)$$

is continuous on $Q$. Furthermore, if $\partial f / \partial x_j$ is continuous on $Q \times [a,b]$,
then

$$\frac{\partial F}{\partial x_j}(x) = \int_{t=a}^{t=b} \frac{\partial f}{\partial x_j}(x,t).$$

This formula is called Leibnitz’s rule for differentiating under the
integral sign.

Proof. Step 1. We show that $F$ is continuous. The rectangle $Q \times [a,b]$
is compact; therefore $f$ is uniformly continuous on $Q \times [a,b]$. That is, given
$\epsilon > 0$, there is a $\delta > 0$ such that

$$|f(x_1, t_1) - f(x_0, t_0)| < \epsilon \quad \text{whenever} \quad |(x_1, t_1) - (x_0, t_0)| < \delta.$$

It follows that when $|x_1 - x_0| < \delta$,

$$|F(x_1) - F(x_0)| \le \int_{t=a}^{t=b} |f(x_1, t) - f(x_0, t)| \le \epsilon (b - a).$$

Continuity of $F$ follows.

Step 2. In calculating the integral and derivatives involved in Leibnitz’s
rule, only the variables $x_j$ and $t$ are involved; all others are held constant.
Therefore it suffices to prove the theorem in the case where $n = 1$ and $Q$ is
an interval $[c,d]$ in $\mathbf{R}$.

Let us set, for $x \in [c,d]$,

$$G(x) = \int_{t=a}^{t=b} D_1 f(x,t).$$

We wish to show that $F'(x)$ exists and equals $G(x)$. For this purpose, we
apply (of all things) the Fubini theorem. We are given that $D_1 f$ is continuous
on $[c,d] \times [a,b]$. Then

$$\int_{x=c}^{x=x_0} G(x) = \int_{x=c}^{x=x_0} \int_{t=a}^{t=b} D_1 f(x,t) = \int_{t=a}^{t=b} \int_{x=c}^{x=x_0} D_1 f(x,t) = \int_{t=a}^{t=b} [f(x_0, t) - f(c, t)] = F(x_0) - F(c);$$

the second equation follows from the Fubini theorem, and the third from the
fundamental theorem of calculus. Then for $x \in [c,d]$, we have

$$\int_c^x G = F(x) - F(c).$$

Since $G$ is continuous by Step 1, we may apply the fundamental theorem of
calculus once more to conclude that

$$G(x) = F'(x). \quad □$$
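
Leibnitz’s rule lends itself to a quick numerical check. In the sketch below, which is only an illustration, the integrand $f(x,t) = \sin(xt)$ on $[a,b] = [0,1]$, the helper `integrate`, and the step sizes are all choices made here. It compares a central-difference approximation of $F'(x_0)$ with the integral of $\partial f/\partial x$.

```python
import math

def integrate(func, a, b, steps=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h

def f(x, t):
    # Hypothetical integrand, continuous with continuous partial in x.
    return math.sin(x * t)

def df_dx(x, t):
    # Partial derivative of f with respect to x, computed by hand.
    return t * math.cos(x * t)

def F(x, a=0.0, b=1.0):
    # F(x) = integral of f(x, t) over t in [a, b].
    return integrate(lambda t: f(x, t), a, b)

def G(x, a=0.0, b=1.0):
    # Right-hand side of Leibnitz's rule: integrate the partial derivative.
    return integrate(lambda t: df_dx(x, t), a, b)

x0, eps = 0.7, 1e-5
finite_difference = (F(x0 + eps) - F(x0 - eps)) / (2 * eps)
print(abs(finite_difference - G(x0)) < 1e-6)
```

The agreement is limited only by the finite-difference step `eps`, since both `F` and `G` use the same quadrature nodes.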


We now obtain a criterion for determining when two closed forms differ
by an exact form. This criterion involves the notion of a differentiable
homotopy.

Definition. Let $A$ and $B$ be open sets in $\mathbf{R}^n$ and $\mathbf{R}^m$, respectively;
let $g, h : A \to B$ be $C^\infty$ maps. We say that $g$ and $h$ are differentiably
homotopic if there is a $C^\infty$ map $H : A \times I \to B$ such that

$$H(x, 0) = g(x) \quad \text{and} \quad H(x, 1) = h(x)$$

for $x \in A$. The map $H$ is called a differentiable homotopy between $g$
and $h$.

For each $t$, the map $x \to H(x,t)$ is a $C^\infty$ map of $A$ into $B$; if we think
of $t$ as “time,” then $H$ gives us a way of “deforming” the map $g$ into the
map $h$, as $t$ goes from 0 to 1.
Theorem 39.2. Let $A$ and $B$ be open sets in $\mathbf{R}^n$ and $\mathbf{R}^m$,
respectively. Let $g, h : A \to B$ be $C^\infty$ maps that are differentiably homotopic.
Then there is a linear transformation

$$\mathcal{P} : \Omega^{k+1}(B) \to \Omega^k(A),$$

defined for $k \ge 0$, such that for any form $\eta$ of order $k > 0$,

$$d\mathcal{P}\eta + \mathcal{P}d\eta = h^*\eta - g^*\eta,$$

while for a form $f$ of order 0,

$$\mathcal{P}df = h^* f - g^* f.$$

This theorem implies that if $\eta$ is a closed form of positive order, then $h^*\eta$
and $g^*\eta$ differ by an exact form, since $h^*\eta - g^*\eta = d\mathcal{P}\eta$ if $\eta$ is closed. On
the other hand, if $f$ is a closed 0-form, then $h^* f - g^* f = 0$.

Note that $d$ raises the order of a form by 1, and $\mathcal{P}$ lowers it by 1. Thus
if $\eta$ has order $k > 0$, all the forms in the first equation have order $k$; and all
the forms in the second equation have order 0. Of course, $\mathcal{P}f$ is not defined
if $f$ is a 0-form.


Proof. Step 1. We consider first a very special case. Given an open
set $A$ in $\mathbf{R}^n$, let $U$ be a neighborhood of $A \times I$ in $\mathbf{R}^{n+1}$, and let $\alpha, \beta : A \to U$
be the maps given by the equations

$$\alpha(x) = (x, 0) \quad \text{and} \quad \beta(x) = (x, 1).$$

(Then $\alpha$ and $\beta$ are differentiably homotopic.) We define, for any $k+1$ form $\eta$
defined in $U$, a $k$-form $P\eta$ defined in $A$, such that

$$(*) \qquad \begin{aligned} dP\eta + Pd\eta &= \beta^*\eta - \alpha^*\eta && \text{if order } \eta > 0, \\ Pdf &= \beta^* f - \alpha^* f && \text{if order } f = 0. \end{aligned}$$

To begin, let $x$ denote the general point of $\mathbf{R}^n$, and let $t$ denote the general
point of $\mathbf{R}$. Then $dx_1, \ldots, dx_n, dt$ are the elementary 1-forms in $\mathbf{R}^{n+1}$. If $g$
is any continuous scalar function in $A \times I$, we define a scalar function $\mathcal{I}g$ on $A$
by the formula

$$(\mathcal{I}g)(x) = \int_{t=0}^{t=1} g(x,t).$$

Then we define $P$ as follows: If $k \ge 0$, the general $k+1$ form $\eta$ in $\mathbf{R}^{n+1}$ can
be written uniquely as

$$\eta = \sum_{[I]} f_I \, dx_I + \sum_{[J]} g_J \, dx_J \wedge dt.$$

Here $I$ denotes an ascending $k+1$ tuple, and $J$ denotes an ascending $k$-tuple,
from the set $\{1, \ldots, n\}$. We define $P$ by the equation

$$P\eta = \sum_{[I]} P(f_I \, dx_I) + \sum_{[J]} P(g_J \, dx_J \wedge dt),$$

where

$$P(f_I \, dx_I) = 0 \quad \text{and} \quad P(g_J \, dx_J \wedge dt) = (-1)^k (\mathcal{I}g_J) \, dx_J.$$

Then $P\eta$ is a $k$-form defined on the subset $A$ of $\mathbf{R}^n$.

Linearity of $P$ follows at once from the uniqueness of the representation
of $\eta$ and linearity of the integral operator $\mathcal{I}$.

To show that $P\eta$ is of class $C^\infty$, we need only show that the function $\mathcal{I}g$
is of class $C^\infty$; and this result follows at once from Leibnitz’s rule, since $g$ is
of class $C^\infty$.

Note that in the special case $k = 0$, the form $\eta$ is a 1-form and is written
as

$$\eta = \sum_{i=1}^{n} f_i \, dx_i + g \, dt;$$

in this case, the tuple $J$ is empty, and we have

$$P\eta = 0 + P(g \, dt) = \mathcal{I}g.$$

Although the operator $P$ may seem rather artificial, it is in fact a rather
natural one. Just as $d$ is in some sense a “differentiation operator,” the
operator $P$ is in some sense an “integration operator,” one that “integrates $\eta$
in the direction of the last coordinate.” An alternate definition of $P$ that
makes this fact clear is given in the exercises.

Step 2. We show that the formulas

$$P(f \, dx_I) = 0 \quad \text{and} \quad P(g \, dx_J \wedge dt) = (-1)^k (\mathcal{I}g) \, dx_J$$

hold even when $I$ is an arbitrary $k+1$ tuple, and $J$ is an arbitrary $k$-tuple,
from the set $\{1, \ldots, n\}$. The proof is easy. If the indices are not distinct, then
these formulas hold trivially, since $dx_I = 0$ and $dx_J = 0$ in this case. If the
indices are distinct and in ascending order, these formulas hold by definition.
Then they hold for any sets of distinct indices, since rearranging the indices
changes the values of $dx_I$ and $dx_J$ only by a sign.

Step 3. We verify formula $(*)$ of Step 1 in the case $k = 0$. We have

$$P(df) = P\Bigl( \sum_{j=1}^{n} \frac{\partial f}{\partial x_j} \, dx_j \Bigr) + P\Bigl( \frac{\partial f}{\partial t} \, dt \Bigr) = 0 + (-1)^0 \, \mathcal{I}\Bigl( \frac{\partial f}{\partial t} \Bigr) = f \circ \beta - f \circ \alpha = \beta^* f - \alpha^* f,$$

where the third equation follows from the fundamental theorem of calculus.

Step 4. We verify formula $(*)$ in the case $k > 0$. Note that because $\alpha$
is the map $\alpha(x) = (x, 0)$, then

$$\alpha^*(dx_i) = d\alpha_i = dx_i \quad \text{for } i = 1, \ldots, n,$$

$$\alpha^*(dt) = d\alpha_{n+1} = 0.$$

A similar remark holds for $\beta^*$.

Now because $d$ and $P$ and $\alpha^*$ and $\beta^*$ are linear, it suffices to verify our
formula for the forms $f \, dx_I$ and $g \, dx_J \wedge dt$. We first consider the case
$\eta = f \, dx_I$. Let us compute both sides of the equation. The left side is

$$dP\eta + Pd\eta = d(0) + P(d\eta) = \Bigl[ \sum_{j=1}^{n} P\Bigl( \frac{\partial f}{\partial x_j} \, dx_j \wedge dx_I \Bigr) \Bigr] + P\Bigl( \frac{\partial f}{\partial t} \, dt \wedge dx_I \Bigr)$$

$$= 0 + (-1)^{k+1} P\Bigl( \frac{\partial f}{\partial t} \, dx_I \wedge dt \Bigr) \quad \text{by Step 2,}$$

$$= [f \circ \beta - f \circ \alpha] \, dx_I.$$

The right side of our equation is

$$\beta^*\eta - \alpha^*\eta = (f \circ \beta) \beta^*(dx_I) - (f \circ \alpha) \alpha^*(dx_I) = [f \circ \beta - f \circ \alpha] \, dx_I.$$

Thus our result holds in this case.
We now consider the case when $\eta = g \, dx_J \wedge dt$. Again, we compute both
sides of the equation. We have

$$(**) \qquad d(P\eta) = d[(-1)^k (\mathcal{I}g) \, dx_J] = (-1)^k \sum_{j=1}^{n} D_j(\mathcal{I}g) \, dx_j \wedge dx_J.$$

On the other hand,

$$d\eta = \sum_{j=1}^{n} (D_j g) \, dx_j \wedge dx_J \wedge dt + (D_{n+1} g) \, dt \wedge dx_J \wedge dt,$$

so that by Step 2,

$$(***) \qquad P(d\eta) = (-1)^{k+1} \sum_{j=1}^{n} \mathcal{I}(D_j g) \, dx_j \wedge dx_J.$$

Adding $(**)$ and $(***)$ and applying Leibnitz’s rule, we see that

$$d(P\eta) + P(d\eta) = 0.$$

On the other hand, the right side of the equation is

$$\beta^*(g \, dx_J \wedge dt) - \alpha^*(g \, dx_J \wedge dt) = 0,$$

since $\beta^*(dt) = 0$ and $\alpha^*(dt) = 0$. This completes the proof of the special case
of the theorem.

Step 5. We now prove the theorem in general. We are given $C^\infty$ maps
$g, h : A \to B$, and a differentiable homotopy $H : A \times I \to B$ between
them. Let $\alpha, \beta : A \to A \times I$ be the maps of Step 1, and let $P$ be the linear
transformation of forms whose properties are stated in Step 1. We then define
our desired linear transformation $\mathcal{P} : \Omega^{k+1}(B) \to \Omega^k(A)$ by the equation

$$\mathcal{P}\eta = P(H^*\eta).$$

See Figure 39.1. Since $H^*\eta$ is a $k+1$ form defined in a neighborhood of $A \times I$,
then $P(H^*\eta)$ is a $k$-form defined in $A$.

Note that since $H$ is a differentiable homotopy between $g$ and $h$,

$$H \circ \alpha = g \quad \text{and} \quad H \circ \beta = h.$$

Then if $k > 0$, we compute

$$d\mathcal{P}\eta + \mathcal{P}d\eta = dP(H^*\eta) + P(H^* d\eta) = dP(H^*\eta) + P(dH^*\eta)$$

$$= \beta^*(H^*\eta) - \alpha^*(H^*\eta) \quad \text{by Step 1,}$$

$$= h^*\eta - g^*\eta,$$

as desired. An entirely similar computation applies if $k = 0$. □

Now we can prove the Poincaré lemma. First, a definition:

Definition. Let $A$ be an open set in $\mathbf{R}^n$. We say that $A$ is star-convex
with respect to the point $p$ of $A$ if for each $x \in A$, the line segment joining $x$
and $p$ lies in $A$.
Figure 39.2

EXAMPLE 1. In Figure 39.2, the set $A$ is star-convex with respect to the
point $p$, but not with respect to the point $q$. The set $B$ is star-convex with
respect to each of its points; that is, it is convex. The set $C$ is not star-convex
with respect to any of its points.


Theorem 39.3 (The Poincaré lemma). Let $A$ be a star-convex
open set in $\mathbf{R}^n$. If $\omega$ is a closed $k$-form on $A$, then $\omega$ is exact on $A$.

Proof. We apply the preceding theorem. Let $p$ be a point with respect
to which $A$ is star-convex. Let $h : A \to A$ be the identity map and let
$g : A \to A$ be the constant map carrying each point to the point $p$. Then $g$
and $h$ are differentiably homotopic; indeed, the map

$$H(x,t) = t \, h(x) + (1 - t) \, g(x)$$

carries $A \times I$ into $A$ and is the desired differentiable homotopy. (For each $t$,
the point $H(x,t)$ lies on the line segment between $h(x) = x$ and $g(x) = p$, so
that it lies in $A$.) We call $H$ the straight-line homotopy between $g$ and $h$.

Let $\mathcal{P}$ be the transformation given by the preceding theorem. If $f$ is a
0-form on $A$, we have

$$\mathcal{P}(df) = h^* f - g^* f = f \circ h - f \circ g.$$

Then if $df = 0$, we have for all $x \in A$,

$$0 = f(h(x)) - f(g(x)) = f(x) - f(p),$$

so that $f$ is constant on $A$.

If $\omega$ is a $k$-form with $k > 0$, we have

$$d\mathcal{P}\omega + \mathcal{P}d\omega = h^*\omega - g^*\omega.$$

Now $h^*\omega = \omega$ because $h$ is the identity map, and $g^*\omega = 0$ because $g$ is a
constant map. Then if $d\omega = 0$, we have

$$d\mathcal{P}\omega = \omega,$$

so that $\omega$ is exact on $A$. □
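
For a closed 1-form $\omega = \sum_i \omega_i \, dx_i$, the proof above is constructive. Pulling $\omega$ back by the straight-line homotopy $H(x,t) = tx + (1-t)p$ and keeping the $dt$ component gives the 0-form $(\mathcal{P}\omega)(x) = \int_0^1 \sum_i \omega_i(H(x,t)) \, (x_i - p_i) \, dt$, and $d\mathcal{P}\omega = \omega$. The sketch below illustrates this numerically under assumptions made here: the sample form `omega`, the base point, and the midpoint-rule and finite-difference helpers are all hypothetical choices.

```python
def omega(x, y, z):
    """Coefficients (w1, w2, w3) of the 1-form w1 dx + w2 dy + w3 dz.

    Hypothetical example; it equals d(x^2 y + x z), so it is closed.
    """
    return (2.0 * x * y + z, x * x, x)

def potential(point, base=(0.0, 0.0, 0.0), steps=2000):
    """Midpoint-rule value of the integral over t in [0, 1] of
    sum_i w_i(H(x, t)) * (x_i - p_i), where H(x, t) = t x + (1 - t) p."""
    diff = [xi - pi for xi, pi in zip(point, base)]
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        q = [pi + t * di for pi, di in zip(base, diff)]
        w = omega(*q)
        total += sum(wi * di for wi, di in zip(w, diff)) * h
    return total

def numerical_gradient(func, point, eps=1e-5):
    """Central-difference approximation of the gradient of func at point."""
    grad = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += eps
        minus[i] -= eps
        grad.append((func(plus) - func(minus)) / (2 * eps))
    return grad

x = [0.8, -1.3, 0.5]
grad = numerical_gradient(potential, x)
# d(P omega) should reproduce the coefficients of omega.
print(all(abs(g - w) < 1e-5 for g, w in zip(grad, omega(*x))))
```

Here $A = \mathbf{R}^3$ is star-convex with respect to the origin, so any base point works; on a smaller star-convex set the base point must be one with respect to which the set is star-convex.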

Theorem 39.4. Let $A$ be a star-convex open set in $\mathbf{R}^n$. Let $\omega$ be
a closed $k$-form on $A$. If $k > 1$, and if $\eta$ and $\eta_0$ are two $k-1$ forms
on $A$ with $d\eta = \omega = d\eta_0$, then

$$\eta = \eta_0 + d\theta$$

for some $k-2$ form $\theta$ on $A$. If $k = 1$, and if $f$ and $f_0$ are two 0-forms
on $A$ with $df = \omega = df_0$, then $f = f_0 + c$ for some constant $c$.

Proof. Since $d(\eta - \eta_0) = 0$, the form $\eta - \eta_0$ is a closed form on $A$. By the
Poincaré lemma, it is exact. A similar comment applies to the form $f - f_0$. □


EXERCISES 

1. (a) Translate the Poincaré lemma for $k$-forms into theorems about scalar
and vector fields in $\mathbf{R}^3$. Consider the cases $k = 0, 1, 2, 3$.

(b) Do the same for Theorem 39.4. Consider the cases $k = 1, 2, 3$.

2. (a) Let $g : A \to B$ be a diffeomorphism of open sets in $\mathbf{R}^n$, of class $C^\infty$.
Show that if $A$ is homologically trivial in dimension $k$, so is $B$.

(b) Find an open set in $\mathbf{R}^2$ that is not star-convex but is homologically
trivial in every dimension.

3. Let $A$ be an open set in $\mathbf{R}^n$. Show that $A$ is homologically trivial in
dimension 0 if and only if $A$ is connected. [Hint: Let $p \in A$. Show that
if $df = 0$, and if $x$ can be joined by a broken-line path in $A$ to $p$, then
$f(x) = f(p)$. Show that the set of all $x$ that can be joined to $p$ by a
broken-line path in $A$ is open in $A$.]

4. Prove the following theorem; it shows that $P$ is in some sense an operator
that integrates in the direction of the last coordinate:

Theorem. Let $A$ be open in $\mathbf{R}^n$; let $\eta$ be a $k+1$ form defined in an
open set $U$ of $\mathbf{R}^{n+1}$ containing $A \times I$. Given $t \in I$, let $\alpha_t : A \to U$
be the “slice” map defined by $\alpha_t(x) = (x,t)$. Given fixed vectors
$(x; v_1), \ldots, (x; v_k)$ in $T_x(\mathbf{R}^n)$, let

$$(y; w_i) = (\alpha_t)_*(x; v_i),$$

for each $t$. Then $(y; w_i)$ belongs to $T_y(\mathbf{R}^{n+1})$; and $y = (x,t)$ is a
function of $t$, but $w_i = (v_i, 0)$ is not. (See Figure 39.3.) Then

$$(P\eta)(x)\bigl( (x; v_1), \ldots, (x; v_k) \bigr) = (-1)^k \int_{t=0}^{t=1} \eta(y)\bigl( (y; w_1), \ldots, (y; w_k), (y; e_{n+1}) \bigr).$$

Figure 39.3 
§40. THE DeRHAM GROUPS OF PUNCTURED EUCLIDEAN SPACE


We have shown that an open set $A$ of $\mathbf{R}^n$ is homologically trivial in all
dimensions if it is star-convex. We now consider some situations in which $A$
is not homologically trivial in all dimensions. The simplest such situation
occurs when $A$ is the punctured euclidean space $\mathbf{R}^n - 0$. Earlier exercises
demonstrated the existence of a closed $n-1$ form in $\mathbf{R}^n - 0$ that is not exact.
Now we analyze the situation further, giving a definitive criterion for deciding
whether or not a given closed form in $\mathbf{R}^n - 0$ is exact.

A convenient way to deal with this question is to define, for an open set $A$
in $\mathbf{R}^n$, certain vector spaces $H^k(A)$ that are called the deRham groups of $A$.
The condition that $A$ be homologically trivial in dimension $k$ is equivalent to
the condition that $H^k(A)$ be the trivial vector space. We shall determine the
dimensions of these spaces in the case $A = \mathbf{R}^n - 0$.

To begin, we consider what is meant by the quotient of a vector space by
a subspace.

Definition. If V is a vector space, and if W is a linear subspace of V, 
we denote by V/W the set whose elements are the subsets of V of the form 

$$v + W = \{v + w \mid w \in W\}.$$

Each such set is called a coset of V, determined by W. One shows readily
that if $v_1 - v_2 \in W$, then the cosets $v_1 + W$ and $v_2 + W$ are equal, while
if $v_1 - v_2 \notin W$, then they are disjoint. Thus V/W is a collection of disjoint
subsets of V whose union is V. (Such a collection is called a partition of V.)
We define vector space operations in V/W by the equations

$$(v_1 + W) + (v_2 + W) = (v_1 + v_2) + W,$$
$$c(v + W) = (cv) + W.$$

With these operations, V/W becomes a vector space. It is called the quotient 
space of V by W . 

We must show these operations are well-defined. Suppose $v_1 + W = v_1' + W$
and $v_2 + W = v_2' + W$. Then $v_1 - v_1'$ and $v_2 - v_2'$ are in W, so that
their sum, which equals $(v_1 + v_2) - (v_1' + v_2')$, is in W. Then

$$(v_1 + v_2) + W = (v_1' + v_2') + W.$$

Thus vector addition is well-defined. A similar proof shows that multiplication 
by a scalar is well-defined. The vector space properties are easy to check; we 
leave the details to you.

Now if V is finite-dimensional, then so is V/W; we shall not however
need this result. On the other hand, V/W may be finite-dimensional even in 
cases where V and W are not. 
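
The coset construction can be made concrete with a small worked example. The choice of $V = \mathbf{R}^2$ and of W as the first coordinate axis is an illustrative assumption, not something taken from the text; it is only meant to show what the cosets and the quotient look like in one simple case.

```latex
% Assumed example: V = R^2 and W = { (s, 0) : s in R }.
% Each coset is a horizontal line, determined by its height b:
\[
  (a, b) + W = \{ (a + s,\, b) \mid s \in \mathbf{R} \}.
\]
% Two cosets agree exactly when the difference of representatives lies in W:
\[
  (a, b) + W = (a', b') + W
  \iff (a - a',\, b - b') \in W
  \iff b = b'.
\]
% Hence the rule below does not depend on the representative chosen,
% and it is a linear isomorphism of V/W with R:
\[
  V/W \longrightarrow \mathbf{R}, \qquad (a, b) + W \longmapsto b .
\]
```

In this example V/W has dimension 1, which is dim V minus dim W.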

Definition. Suppose V and V' are vector spaces, and suppose W and
W' are linear subspaces of V and V', respectively. If $T : V \to V'$ is a linear
transformation that carries W into W', then there is a linear transformation

$$\tilde{T} : V/W \to V'/W'$$

defined by the equation $\tilde{T}(v + W) = T(v) + W'$; it is said to be induced
by T. One checks readily that $\tilde{T}$ is well-defined and linear.

Now we can define deRham groups. 

Definition. Let A be an open set in $\mathbf{R}^n$. The set $\Omega^k(A)$ of all k-forms
on A is a vector space. The set $C^k(A)$ of closed k-forms on A and the set
$E^k(A)$ of exact k-forms on A are linear subspaces of $\Omega^k(A)$. Since every
exact form is closed, $E^k(A)$ is contained in $C^k(A)$. We define the deRham
group of A in dimension k to be the quotient vector space

$$H^k(A) = C^k(A)/E^k(A).$$

If $\omega$ is a closed k-form on A (i.e., an element of $C^k(A)$), we often denote its
coset $\omega + E^k(A)$ simply by $\{\omega\}$.

It is immediate that $H^k(A)$ is the trivial vector space, consisting of the
zero vector alone, if and only if A is homologically trivial in dimension k.

Now if A and B are open sets in $\mathbf{R}^n$ and $\mathbf{R}^m$, respectively, and if $g : A \to B$
is a $C^\infty$ map, then g induces a linear transformation $g^* : \Omega^k(B) \to \Omega^k(A)$ of
forms, for all k. Because $g^*$ commutes with d, it carries closed forms to closed
forms and exact forms to exact forms; thus $g^*$ induces a linear transformation

$$\tilde{g}^* : H^k(B) \to H^k(A)$$

of deRham groups. (For convenience, we denote this induced transformation
also by $g^*$, rather than by $\tilde{g}^*$.)

Studying closed forms and exact forms on a given set A now reduces to 
calculating the deRham groups of A. There are several tools that are used 
in computing these groups. We consider two of them here. One involves the
notion of a homotopy equivalence. The other is a special case of a general 
theorem called the Mayer-Vietoris theorem. Both are standard tools in 
algebraic topology.

Theorem 40.1 (Homotopy equivalence theorem). Let A and B
be open sets in $\mathbf{R}^n$ and $\mathbf{R}^m$, respectively. Let $g : A \to B$ and $h : B \to A$
be $C^\infty$ maps. If $g \circ h : B \to B$ is differentiably homotopic to the identity
map $i_B$ of B, and if $h \circ g : A \to A$ is differentiably homotopic to the
identity map $i_A$ of A, then $g^*$ and $h^*$ are linear isomorphisms of the
deRham groups.

If $g \circ h$ equals $i_B$ and $h \circ g$ equals $i_A$, then of course g and h are
diffeomorphisms. If g and h satisfy the hypotheses of this theorem, then they
are called (differentiable) homotopy equivalences.

Proof. If $\eta$ is a closed k-form on A, for $k \geq 0$, then Theorem 39.2
implies that

$$(h \circ g)^*\eta - (i_A)^*\eta$$

is exact. Then the induced maps of the deRham groups satisfy the equation

$$g^*(h^*(\{\eta\})) = \{\eta\},$$

so that $g^* \circ h^*$ is the identity map of $H^k(A)$ with itself. A similar argument
shows that $h^* \circ g^*$ is the identity map of $H^k(B)$. The first fact implies that $g^*$
maps $H^k(B)$ onto $H^k(A)$, since given $\{\eta\}$ in $H^k(A)$, it equals $g^*(h^*\{\eta\})$.
The second fact implies that $g^*$ is one-to-one, since the equation $g^*\{\omega\} = 0$
implies that $h^*(g^*\{\omega\}) = 0$, whence $\{\omega\} = 0$.

By symmetry, $h^*$ is also a linear isomorphism. □


In order to prove our other major theorem, we need a technical lemma: 

Lemma 40.2. Let U and V be open sets in $\mathbf{R}^n$; let $X = U \cup V$;
and suppose $A = U \cap V$ is non-empty. Then there exists a $C^\infty$ function
$\phi : X \to [0, 1]$ such that $\phi$ is identically 0 in a neighborhood of U - A
and $\phi$ is identically 1 in a neighborhood of V - A.

Proof. See Figure 40.1. Let $\{\phi_i\}$ be a partition of unity on X dominated
by the open covering {U, V}. Let $S_i = \operatorname{Support} \phi_i$ for each i. Divide the
index set of the collection $\{\phi_i\}$ into two disjoint subsets J and K, so that for
every $i \in J$, the set $S_i$ is contained in U, and for every $i \in K$, the set $S_i$
is contained in V. (For example, one could let J consist of all i such that
$S_i \subset U$, and let K consist of the remaining i.) Then let

$$\phi(x) = \sum_{i \in K} \phi_i(x).$$

The local finiteness condition guarantees that $\phi$ is of class $C^\infty$ on X, since
each $x \in X$ has a neighborhood on which $\phi$ equals a finite sum of $C^\infty$
functions.

Let $a \in U - A$; we show $\phi$ is identically 0 in a neighborhood of a. First,
we choose a neighborhood W of a that intersects only finitely many sets $S_i$.
From among these sets $S_i$, take those whose indices belong to K, and let D
be their union. Then D is closed, and D does not contain the point a. The
set W - D is thus a neighborhood of a, and for each $i \in K$, the function $\phi_i$
vanishes on W - D. It follows that $\phi(x) = 0$ for $x \in W - D$.

Since

$$1 - \phi(x) = \sum_{i \in J} \phi_i(x),$$

symmetry implies that the function $1 - \phi$ is identically 0 in a neighborhood
of V - A. □

Theorem 40.3 (Mayer-Vietoris, special case). Let U and V be
open sets in $\mathbf{R}^n$ with U and V homologically trivial in all dimensions.
Let $X = U \cup V$; suppose $A = U \cap V$ is non-empty. Then $H^0(X)$ is
trivial, and for $k \geq 0$, the space $H^{k+1}(X)$ is linearly isomorphic to the
space $H^k(A)$.

Proof. We introduce some notation that will be convenient. If B, C
are open sets of $\mathbf{R}^n$ with $B \subset C$, and if $\eta$ is a k-form on C, we denote by
$\eta|B$ the restriction of $\eta$ to B. That is, $\eta|B = j^*\eta$, where j is the inclusion
map $j : B \to C$. Since $j^*$ commutes with d, it follows that the restriction of
a closed or exact form is closed or exact, respectively. It also follows that if
$A \subset B \subset C$, then $(\eta|B)|A = \eta|A$.

Step 1. We first show that $H^0(X)$ is trivial. Let f be a closed 0-form
on X. Then $f|U$ and $f|V$ are closed forms on U and V, respectively.
Because U and V are homologically trivial in dimension 0, there are constant
functions $c_1$ and $c_2$ such that $f|U = c_1$ and $f|V = c_2$. Since $U \cap V$ is
non-empty, $c_1 = c_2$; thus f is constant on X.

Step 2. Let $\phi : X \to [0, 1]$ be a $C^\infty$ function such that $\phi$ vanishes in a
neighborhood U' of U - A and $1 - \phi$ vanishes in a neighborhood V' of V - A.
For $k \geq 0$, we define

$$\delta : \Omega^k(A) \to \Omega^{k+1}(X)$$

by the equation

$$\delta(\omega) = \begin{cases} d\phi \wedge \omega & \text{on } A, \\ 0 & \text{on } U' \cup V'. \end{cases}$$

Since $d\phi = 0$ on the set $U' \cup V'$, the form $\delta(\omega)$ is well-defined; since A and
$U' \cup V'$ are open and their union is X, it is of class $C^\infty$ on X. The map $\delta$ is
clearly linear. It commutes with the differential operator d, up to sign, since

$$d(\delta(\omega)) = \begin{cases} (-1)\, d\phi \wedge d\omega & \text{on } A, \\ 0 & \text{on } U' \cup V' \end{cases} \;=\; -\delta(d\omega).$$

Then $\delta$ carries closed forms to closed forms, and exact forms to exact forms,
so it induces a linear transformation

$$\tilde{\delta} : H^k(A) \to H^{k+1}(X).$$

We show that $\tilde{\delta}$ is an isomorphism.

Step 3. We first show that $\tilde{\delta}$ is one-to-one. For this purpose, it suffices
to show that if $\omega$ is a closed k-form in A such that $\delta(\omega)$ is exact, then $\omega$ is
itself exact.

So suppose $\delta(\omega) = d\theta$ for some k-form $\theta$ on X. We define k-forms $\omega_1$
and $\omega_2$ on U and V, respectively, by the equations

$$\omega_1 = \begin{cases} \phi\omega & \text{on } A, \\ 0 & \text{on } U', \end{cases}
\qquad \text{and} \qquad
\omega_2 = \begin{cases} (1 - \phi)\omega & \text{on } A, \\ 0 & \text{on } V'. \end{cases}$$

Then $\omega_1$ and $\omega_2$ are well-defined and of class $C^\infty$. See Figure 40.2.

Figure 40.2

We compute

$$d\omega_1 = \begin{cases} d\phi \wedge \omega + 0 & \text{on } A, \\ 0 & \text{on } U'; \end{cases}$$

the first equation follows from the fact that $d\omega = 0$. Then

$$d\omega_1 = \delta(\omega)|U = d\theta|U.$$

It follows that $\omega_1 - \theta|U$ is a closed k-form on U. An entirely similar proof
shows that

$$d\omega_2 = -d\theta|V,$$

so that $\omega_2 + \theta|V$ is a closed k-form on V.

Now U and V are homologically trivial in all dimensions. If k > 0, this
implies that there are k - 1 forms $\eta_1$ and $\eta_2$ on U and V, respectively, such
that

$$\omega_1 - \theta|U = d\eta_1 \quad \text{and} \quad \omega_2 + \theta|V = d\eta_2.$$

Restricting to A and adding, we have

$$\omega_1|A + \omega_2|A = d\eta_1|A + d\eta_2|A,$$

which implies that

$$\phi\omega + (1 - \phi)\omega = d(\eta_1|A + \eta_2|A).$$

Thus $\omega$ is exact on A.

If k = 0, then there are constants $c_1$ and $c_2$ such that

$$\omega_1 - \theta|U = c_1 \quad \text{and} \quad \omega_2 + \theta|V = c_2.$$

Then

$$\phi\omega + (1 - \phi)\omega = \omega_1|A + \omega_2|A = c_1 + c_2.$$

Step 4. We show $\tilde{\delta}$ maps $H^k(A)$ onto $H^{k+1}(X)$. For this purpose, it
suffices to show that if $\eta$ is a closed k + 1 form in X, then there is a closed
k-form $\omega$ in A such that $\eta - \delta(\omega)$ is exact.

Given $\eta$, the forms $\eta|U$ and $\eta|V$ are closed; hence there are k-forms $\theta_1$
and $\theta_2$ on U and V respectively, such that

$$d\theta_1 = \eta|U \quad \text{and} \quad d\theta_2 = \eta|V.$$

Let $\omega$ be the k-form on A defined by the equation

$$\omega = \theta_1|A - \theta_2|A;$$

then $\omega$ is closed because $d\omega = d\theta_1|A - d\theta_2|A = \eta|A - \eta|A = 0$. We define a
k-form $\theta$ on X by the equation

$$\theta = \begin{cases} (1 - \phi)\theta_1 + \phi\theta_2 & \text{on } A, \\ \theta_1 & \text{on } U', \\ \theta_2 & \text{on } V'. \end{cases}$$

Then $\theta$ is well-defined and of class $C^\infty$. See Figure 40.3.

Figure 40.3

We show that

$$\eta - \delta(\omega) = d\theta;$$

this completes the proof.

We compute $d\theta$ on A and U' and V' separately. Restricting to A, we
have

$$\begin{aligned}
d\theta|A &= [-d\phi \wedge (\theta_1|A) + (1 - \phi)(d\theta_1|A)] + [d\phi \wedge (\theta_2|A) + \phi(d\theta_2|A)] \\
&= \phi\,\eta|A + (1 - \phi)\,\eta|A + d\phi \wedge [\theta_2|A - \theta_1|A] \\
&= \eta|A + d\phi \wedge (-\omega) \\
&= \eta|A - \delta(\omega)|A.
\end{aligned}$$

Restricting to U' and to V', we compute

$$d\theta|U' = d\theta_1|U' = \eta|U' = \eta|U' - \delta(\omega)|U',$$
$$d\theta|V' = d\theta_2|V' = \eta|V' = \eta|V' - \delta(\omega)|V',$$

since $\delta(\omega)|U' = 0$ and $\delta(\omega)|V' = 0$ by definition. It follows that

$$d\theta = \eta - \delta(\omega),$$

as desired. □

Now we can calculate the deRham groups of punctured euclidean space.

Theorem 40.4. Let $n \geq 1$. Then

$$\dim H^k(\mathbf{R}^n - 0) = \begin{cases} 0 & \text{for } k \neq n - 1, \\ 1 & \text{for } k = n - 1. \end{cases}$$

Proof. Step 1. We prove the theorem for n = 1. Let $A = \mathbf{R}^1 - 0$;
write $A = A_0 \cup A_1$, where $A_0$ consists of the negative reals and $A_1$ consists
of the positive reals. If $\omega$ is a closed k-form in A, with k > 0, then $\omega|A_0$ and
$\omega|A_1$ are closed. Since $A_0$ and $A_1$ are star-convex, there are k - 1 forms $\eta_0$
and $\eta_1$ on $A_0$ and $A_1$, respectively, such that $d\eta_i = \omega|A_i$ for i = 0, 1. Define
$\eta = \eta_0$ on $A_0$ and $\eta = \eta_1$ on $A_1$. Then $\eta$ is well-defined and of class $C^\infty$, and
$d\eta = \omega$.

Now let $f_0$ be the 0-form in A defined by setting $f_0(x) = 0$ for $x \in A_0$
and $f_0(x) = 1$ for $x \in A_1$. Then $f_0$ is a closed form, and $f_0$ is not exact.
We show the coset $\{f_0\}$ forms a basis for $H^0(A)$. Given a closed 0-form f
on A, the forms $f|A_0$ and $f|A_1$ are closed and thus exact. Then there are
constants $c_0$ and $c_1$ such that $f|A_0 = c_0$ and $f|A_1 = c_1$. It follows that

$$f(x) = (c_1 - c_0) f_0(x) + c_0$$

for $x \in A$. Then $\{f\} = (c_1 - c_0)\{f_0\}$, as desired.

Step 2. If B is open in $\mathbf{R}^n$, then $B \times \mathbf{R}$ is open in $\mathbf{R}^{n+1}$. We show that
for all k,

$$\dim H^k(B) = \dim H^k(B \times \mathbf{R}).$$

We use the homotopy equivalence theorem. Define $g : B \to B \times \mathbf{R}$ by
the equation $g(x) = (x, 0)$, and define $h : B \times \mathbf{R} \to B$ by the equation
$h(x, s) = x$. Then $h \circ g$ equals the identity map of B with itself. On the
other hand, $g \circ h$ is differentiably homotopic to the identity map of $B \times \mathbf{R}$
with itself; the straight-line homotopy will suffice. It is given by the equation

$$H((x, s), t) = t(x, s) + (1 - t)(x, 0) = (x, st).$$

Step 3. Let $n \geq 1$. We assume the theorem true for n and prove it
for n + 1.

Let U and V be the open sets in $\mathbf{R}^{n+1}$ defined by the equations

$$U = \mathbf{R}^{n+1} - \{(0, \ldots, 0, t) \mid t \geq 0\},$$
$$V = \mathbf{R}^{n+1} - \{(0, \ldots, 0, t) \mid t \leq 0\}.$$

Thus U consists of all of $\mathbf{R}^{n+1}$ except for points on the half-line $0 \times \mathbf{H}^1$, and V
consists of all of $\mathbf{R}^{n+1}$ except for points on the half-line $0 \times \mathbf{L}^1$. Figure 40.4
illustrates the case n = 3. The set $A = U \cap V$ is non-empty; indeed, A
consists of all points of $\mathbf{R}^{n+1} = \mathbf{R}^n \times \mathbf{R}$ not on the line $0 \times \mathbf{R}$; that is,

$$A = (\mathbf{R}^n - 0) \times \mathbf{R}.$$

Figure 40.4

If we set $X = U \cup V$, then

$$X = \mathbf{R}^{n+1} - 0.$$

The set U is star-convex relative to the point $p = (0, \ldots, 0, -1)$ of $\mathbf{R}^{n+1}$, and
the set V is star-convex relative to the point $q = (0, \ldots, 0, 1)$, as you can
readily check. It follows from the preceding theorem that $H^0(X)$ is trivial,
and that

$$\dim H^{k+1}(X) = \dim H^k(A) \quad \text{for } k \geq 0.$$

Now Step 2 tells us that $H^k(A)$ has the same dimension as $H^k(\mathbf{R}^n - 0)$, and
the induction hypothesis implies that the latter has dimension 0 if $k \neq n - 1$,
and dimension 1 if k = n - 1. The theorem follows. □
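
To see how the three steps of the proof fit together, it may help to unroll the induction once, passing from n = 1 to n = 2. The display below is only a sketch of that single pass; it uses nothing beyond Theorem 40.3 and Steps 1 and 2, with $X = \mathbf{R}^2 - 0$ and $A = (\mathbf{R}^1 - 0) \times \mathbf{R}$ as in Step 3.

```latex
\begin{align*}
\dim H^1(\mathbf{R}^2 - 0)
  &= \dim H^0\bigl((\mathbf{R}^1 - 0) \times \mathbf{R}\bigr)
     && \text{(Theorem 40.3 with } k = 0\text{)} \\
  &= \dim H^0(\mathbf{R}^1 - 0)
     && \text{(Step 2)} \\
  &= 1
     && \text{(Step 1)}, \\[4pt]
\dim H^{k+1}(\mathbf{R}^2 - 0)
  &= \dim H^{k}(\mathbf{R}^1 - 0) = 0
     && \text{for } k \geq 1 \text{ (same chain, then Step 1)}, \\[4pt]
\dim H^0(\mathbf{R}^2 - 0)
  &= 0
     && \text{(Theorem 40.3: } H^0(X) \text{ is trivial)}.
\end{align*}
```

So for n = 2 the only non-trivial group is the one in dimension 1 = n - 1, as the theorem asserts.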




Let us restate this theorem in terms of forms.

Theorem 40.5. Let $A = \mathbf{R}^n - 0$, with n > 1.

(a) If $k \neq n - 1$, then every closed k-form on A is exact on A.

(b) There is a closed n - 1 form $\eta_0$ on A that is not exact. If $\eta$ is any
closed n - 1 form on A, then there is a unique scalar c such that $\eta - c\eta_0$
is exact. □

This theorem guarantees the existence of a closed n - 1 form in $\mathbf{R}^n - 0$
that is not exact, but it does not give us a formula for such a form. In the
exercises of the last chapter, however, we obtained such a formula. If $\eta_0$ is
the n - 1 form in $\mathbf{R}^n - 0$ given by the equation

$$\eta_0 = \sum_{i=1}^{n} (-1)^{i-1} f_i \, dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n,$$

where $f_i(x) = x_i / \|x\|^n$, then it is easy to show by direct computation that
$\eta_0$ is closed, and only somewhat more difficult to show that the integral of $\eta_0$
over $S^{n-1}$ is non-zero, so that by Stokes’ theorem it cannot be exact. (See
the exercises of §35 or §38.) Using this result, we now derive the following
criterion for a closed n - 1 form in $\mathbf{R}^n - 0$ to be exact:

Theorem 40.6. Let $A = \mathbf{R}^n - 0$, with n > 1. If $\eta$ is a closed n - 1
form in A, then $\eta$ is exact in A if and only if

$$\int_{S^{n-1}} \eta = 0.$$

Proof. If $\eta$ is exact, then its integral over $S^{n-1}$ is 0, by Stokes’ theorem.
On the other hand, suppose this integral is zero. Let $\eta_0$ be the form just
defined. The preceding theorem tells us that there is a unique scalar c such
that $\eta - c\eta_0$ is exact. Then

$$\int_{S^{n-1}} \eta = c \int_{S^{n-1}} \eta_0,$$

by Stokes’ theorem. Since the integral of $\eta_0$ over $S^{n-1}$ is not 0, we must have
c = 0. Thus $\eta$ is exact. □
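
As a concrete instance of the form $\eta_0$ and of the criterion in Theorem 40.6, the case n = 2 can be worked out by hand. The computation below is a sketch under two assumptions that are not fixed by the text above: the circle $S^1$ is parametrized by $\gamma(t) = (\cos t, \sin t)$ for $0 \leq t \leq 2\pi$, and it carries the counterclockwise orientation.

```latex
% The form eta_0 for n = 2, with f_i(x) = x_i / ||x||^2:
\[
  \eta_0 = f_1\,dx_2 - f_2\,dx_1
         = \frac{x_1\,dx_2 - x_2\,dx_1}{x_1^2 + x_2^2}.
\]
% Closedness, by direct differentiation:
\[
  d\eta_0
  = \Bigl(\frac{\partial f_1}{\partial x_1}
        + \frac{\partial f_2}{\partial x_2}\Bigr)\,dx_1 \wedge dx_2
  = \frac{(x_2^2 - x_1^2) + (x_1^2 - x_2^2)}{(x_1^2 + x_2^2)^2}\,
    dx_1 \wedge dx_2
  = 0.
\]
% Integral over the unit circle (assumed parametrization gamma):
\[
  \int_{S^1} \eta_0
  = \int_0^{2\pi} \bigl(\cos t \cdot \cos t - \sin t \cdot (-\sin t)\bigr)\,dt
  = 2\pi \neq 0.
\]
% Consequently, for a closed 1-form eta on R^2 - 0, the scalar c of
% Theorem 40.5 is determined by
\[
  c = \frac{1}{2\pi} \int_{S^1} \eta .
\]
```

With these conventions, a closed 1-form on $\mathbf{R}^2 - 0$ is exact precisely when its integral around the unit circle vanishes.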
EXERCISES


1. (a) Show that V/W is a vector space.

(b) Show that the transformation $\tilde{T}$ induced by a linear transformation T
is well-defined and linear.

2. Suppose $a_1, \ldots, a_n$ is a basis for V whose first k elements form a basis
for the linear subspace W. Show that the cosets $a_{k+1} + W, \ldots, a_n + W$
form a basis for V/W.

3. (a) Translate Theorems 40.5 and 40.6 into theorems about vector and
scalar fields in $\mathbf{R}^n - 0$, in the case n = 2.

(b) Repeat for the case n = 3.

4. Let U and V be open sets in $\mathbf{R}^n$; let $X = U \cup V$; assume that
$A = U \cap V$ is non-empty. Let $\tilde{\delta} : H^k(A) \to H^{k+1}(X)$ be the transformation
constructed in the proof of Theorem 40.3. What hypotheses on $H^i(U)$
and $H^i(V)$ are needed to ensure that:

(a) $\tilde{\delta}$ is one-to-one?

(b) The image of $\tilde{\delta}$ is all of $H^{k+1}(X)$?

(c) $H^0(X)$ is trivial?

5. Prove the following:

Theorem. Let p and q be two points of $\mathbf{R}^n$; let n > 1. Then

$$\dim H^k(\mathbf{R}^n - p - q) = \begin{cases} 0 & \text{if } k \neq n - 1, \\ 2 & \text{if } k = n - 1. \end{cases}$$

Proof. Let S = {p, q}. Use Theorem 40.3 to show that the open set
$\mathbf{R}^{n+1} - S \times \mathbf{H}^1$ of $\mathbf{R}^{n+1}$ is homologically trivial in all dimensions. Then
proceed by induction, as in the proof of Theorem 40.4.

6. Restate the theorem of Exercise 5 in terms of forms.

7. Derive a criterion analogous to that in Theorem 40.6 for a closed n - 1
form in $\mathbf{R}^n - p - q$ to be exact.

8. Translate results of Exercises 6 and 7 into theorems about vector and
scalar fields in $\mathbf{R}^n - p - q$ in the cases n = 2 and n = 3.

§41. DIFFERENTIABLE MANIFOLDS AND RIEMANNIAN MANIFOLDS


Throughout this book, we have dealt with submanifolds of euclidean space 
and with forms defined in open sets of euclidean space. This approach has 
the advantage of conceptual simplicity; one tends to be more comfortable 
dealing with subspaces of $\mathbf{R}^n$ than with arbitrary metric spaces. It has the
disadvantage, however, that important ideas are sometimes obscured by the 
familiar surroundings. That is the case here. 

Furthermore, it is true that, in higher mathematics as well as in other
subjects such as mathematical physics, manifolds often occur as abstract spaces
rather than as subspaces of euclidean space. To treat them with the proper
degree of generality requires that one move outside $\mathbf{R}^n$.

In this section, we describe briefly how this can be accomplished, and 
indicate how mathematicians really look at manifolds and forms. 

Differentiable manifolds 

Definition. Let M be a metric space. Suppose there is a collection of
homeomorphisms $\alpha_i : U_i \to V_i$, where $U_i$ is open in $\mathbf{H}^k$ or $\mathbf{R}^k$, and $V_i$ is open
in M, such that the sets $V_i$ cover M. (To say that $\alpha_i$ is a homeomorphism
is to say that $\alpha_i$ carries $U_i$ onto $V_i$ in a one-to-one fashion, and that both $\alpha_i$
and $\alpha_i^{-1}$ are continuous.) Suppose that the maps $\alpha_i$ overlap with class $C^\infty$;
this means that the transition function $\alpha_i^{-1} \circ \alpha_j$ is of class $C^\infty$ whenever
$V_i \cap V_j$ is nonempty. The maps $\alpha_i$ are called coordinate patches on M,
and so is any other homeomorphism $\alpha : U \to V$, where U is open in $\mathbf{H}^k$
or $\mathbf{R}^k$, and V is open in M, that overlaps the $\alpha_i$ with class $C^\infty$. The metric
space M, together with this collection of coordinate patches on M, is called
a differentiable k-manifold (of class $C^\infty$).

In the case k = 1, we make the special convention that the domains of
the coordinate patches may be open sets in $\mathbf{L}^1$ as well as $\mathbf{R}^1$ or $\mathbf{H}^1$, just as we
did before.

If there is a coordinate patch $\alpha : U \to V$ about the point p of M such
that U is open in $\mathbf{R}^k$, then p is called an interior point of M. Otherwise, p is
called a boundary point of M. The set of boundary points of M is denoted
$\partial M$. If $\alpha : U \to V$ is a coordinate patch on M about p, then p belongs to
$\partial M$ if and only if U is open in $\mathbf{H}^k$ and $p = \alpha(x)$ for some $x \in \mathbf{R}^{k-1} \times 0$. The
proof is the same as that of Lemma 24.2.

Throughout this section, M will denote a differentiable k-manifold.

Definition. Given coordinate patches $\alpha_0, \alpha_1$ on M, we say they overlap
positively if $\det D(\alpha_1^{-1} \circ \alpha_0) > 0$. If M can be covered by coordinate
patches that overlap positively, then M is said to be orientable. An
orientation of M consists of such a covering of M, along with all other coordinate
patches that overlap these positively. An oriented manifold consists of a
manifold M together with an orientation of M.

Given an orientation $\{\alpha_i\}$ of M, the collection $\{\alpha_i \circ r\}$, where $r : \mathbf{R}^k \to \mathbf{R}^k$
is the reflection map, gives a different orientation of M; it is called the
orientation opposite to the given one.

Suppose M is a differentiable k-manifold with non-empty boundary. Then
$\partial M$ is a differentiable k - 1 manifold without boundary. The maps $\alpha \circ b$,
where $\alpha$ is a coordinate patch on M about $p \in \partial M$ and $b : \mathbf{R}^{k-1} \to \mathbf{R}^k$ is the
map

$$b(x_1, \ldots, x_{k-1}) = (x_1, \ldots, x_{k-1}, 0),$$

are coordinate patches on $\partial M$. The proof is the same as that of Theorem 24.3.

If the patches $\alpha_0$ and $\alpha_1$ on M overlap positively, so do the coordinate
patches $\alpha_0 \circ b$ and $\alpha_1 \circ b$ on $\partial M$; the proof is that of Theorem 34.1. Thus if M
is oriented and $\partial M$ is nonempty, then $\partial M$ can be oriented simply by taking
coordinate patches on M belonging to the orientation of M about points of
$\partial M$, and composing them with the map b. If k is even, the orientation of
$\partial M$ obtained in this way is called the induced orientation of $\partial M$; if k is
odd, the opposite of this orientation is so called.

Now let us define differentiability for maps between two differentiable
manifolds.

Definition. Let M and N be differentiable manifolds of dimensions k
and n, respectively. Suppose A is a subset of M; and suppose $f : A \to N$.
We say that f is of class $C^\infty$ if for each $x \in A$, there is a coordinate patch
$\alpha : U \to V$ on M about x, and a coordinate patch $\beta : W \to Y$ on N about
y = f(x), such that the composite $\beta^{-1} \circ f \circ \alpha$ is of class $C^\infty$, as a map of
a subset of $\mathbf{R}^k$ into $\mathbf{R}^n$. Because the transition functions are of class $C^\infty$,
this condition is independent of the choice of the coordinate patches. See
Figure 41.1.

Figure 41.1


Of course, if M or N equals euclidean space, this definition simplifies, 
since one can take one of the coordinate patches to be the identity map of 
that euclidean space. 

A one-to-one map $f : M \to N$ carrying M onto N is called a
diffeomorphism if both f and $f^{-1}$ are of class $C^\infty$.

Now we define what we mean by a tangent vector to M . Since we have 
here no surrounding euclidean space to work with, it is not obvious what a 
tangent vector should be. 

Our usual picture of a tangent vector to a manifold M in $\mathbf{R}^n$ at a point p
of M is that it is the velocity vector of a $C^\infty$ curve $\gamma : [a, b] \to M$ that passes
through p. This vector is just the pair $(p; D\gamma(t_0))$ where $p = \gamma(t_0)$ and $D\gamma$
is the derivative of $\gamma$.

Let us try to generalize this notion. If M is an arbitrary differentiable
manifold, and $\gamma$ is a $C^\infty$ curve in M, what does one mean by the “derivative”
of the function $\gamma$? Certainly one cannot speak of derivatives in the ordinary
sense, since M does not lie in euclidean space. However, if $\alpha : U \to V$ is a
coordinate patch in M about the point p, then the composite function $\alpha^{-1} \circ \gamma$
is a map from a subset of $\mathbf{R}^1$ into $\mathbf{R}^k$, so we can speak of its derivative. We
can thus think of the “derivative” of $\gamma$ at $t_0$ as the function v that assigns, to
each coordinate patch $\alpha$ about the point p, the matrix

$$v(\alpha) = D(\alpha^{-1} \circ \gamma)(t_0),$$

where $p = \gamma(t_0)$.

Of course, the matrix $D(\alpha^{-1} \circ \gamma)$ depends on the particular coordinate
patch chosen; if $\alpha_0$ and $\alpha_1$ are two coordinate patches about p, the chain rule
implies that these matrices are related by the equation

$$v(\alpha_1) = Dg(x_0) \cdot v(\alpha_0),$$

where g is the transition function $g = \alpha_1^{-1} \circ \alpha_0$, and $x_0 = \alpha_0^{-1}(p)$. See
Figure 41.2.

Figure 41.2
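
The chain-rule step behind the relation between $v(\alpha_1)$ and $v(\alpha_0)$ can be written out in full. The display below is a short derivation using only the symbols already introduced ($\gamma$, $\alpha_0$, $\alpha_1$, the transition function g, and $x_0 = \alpha_0^{-1}(p)$); the identities are understood to hold for parameter values near $t_0$, where $\gamma$ stays inside both patches.

```latex
\begin{align*}
\alpha_1^{-1} \circ \gamma
  &= (\alpha_1^{-1} \circ \alpha_0) \circ (\alpha_0^{-1} \circ \gamma)
   = g \circ (\alpha_0^{-1} \circ \gamma), \\[4pt]
D(\alpha_1^{-1} \circ \gamma)(t_0)
  &= Dg\bigl((\alpha_0^{-1} \circ \gamma)(t_0)\bigr)
     \cdot D(\alpha_0^{-1} \circ \gamma)(t_0)
     && \text{(chain rule)} \\
  &= Dg(x_0) \cdot D(\alpha_0^{-1} \circ \gamma)(t_0)
     && \text{(since } \gamma(t_0) = p\text{)}, \\[4pt]
v(\alpha_1) &= Dg(x_0) \cdot v(\alpha_0).
\end{align*}
```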


The pattern of this example suggests to us how to define a tangent vector 
to M in general. 

Definition. Given $p \in M$, a tangent vector to M at p is a function v
that assigns, to each coordinate patch $\alpha : U \to V$ in M about p, a column
matrix of size k by 1 which we denote $v(\alpha)$. If $\alpha_0$ and $\alpha_1$ are two coordinate
patches about p, we require that

$$(*) \qquad v(\alpha_1) = Dg(x_0) \cdot v(\alpha_0),$$

where $g = \alpha_1^{-1} \circ \alpha_0$ is the transition function and $x_0 = \alpha_0^{-1}(p)$. The entries
of the matrix $v(\alpha)$ are called the components of v with respect to the
coordinate patch $\alpha$.

It follows from (*) that a tangent vector v to M at p is entirely determined 
once its components are given with respect to a single coordinate system. It 
also follows from (*) that if v and w are tangent vectors to M at p, then we
can define av + bw unambiguously by setting

$$(av + bw)(\alpha) = a\,v(\alpha) + b\,w(\alpha)$$

for each a. That is, we add tangent vectors by adding their components in 
the usual way in each coordinate patch. And we multiply a vector v by a 
scalar similarly. 

The set of tangent vectors to M at p is denoted $T_p(M)$; it is called the
tangent space to M at p. It is easy to see that it is a k-dimensional space;
indeed, if $\alpha$ is a coordinate patch about p with $\alpha(x) = p$, one checks readily
that the map $v \to (x; v(\alpha))$, which carries $T_p(M)$ onto $T_x(\mathbf{R}^k)$, is a linear
isomorphism. The inverse of this map is denoted by

$$\alpha_* : T_x(\mathbf{R}^k) \to T_p(M).$$

It satisfies the equation $\alpha_*(x; v(\alpha)) = v$.

Given a $C^\infty$ curve $\gamma : [a, b] \to M$ in M, with $\gamma(t_0) = p$, we define the
velocity vector v of this curve corresponding to the parameter value $t_0$ by
the equation

$$v(\alpha) = D(\alpha^{-1} \circ \gamma)(t_0);$$

then v is a tangent vector to M at p. One readily shows that every tangent 
vector to M at p is the velocity vector of some such curve. 

REMARK. There is an alternate approach to defining tangent vectors that is 
quite common. We describe it here. 

Suppose v is a tangent vector to M at the point p of M. There is
associated with v a certain operator $X_v$ on real-valued $C^\infty$ functions defined
near p. This operator is called the derivative with respect to v; it arises
from the following considerations:

Suppose f is a $C^\infty$ function on M defined in a neighborhood of p, and
suppose v is the velocity vector of the curve $\gamma : [a, b] \to M$ corresponding
to the parameter value $t_0$, where $\gamma(t_0) = p$. Then the derivative $d(f \circ \gamma)/dt$
measures the rate of change of f with respect to the parameter t of the curve.
If $\alpha : U \to V$ is a coordinate patch about p, with $\alpha(x) = p$, we can express
this derivative as follows: We write $f \circ \gamma = (f \circ \alpha) \circ (\alpha^{-1} \circ \gamma)$, and compute

$$\frac{d(f \circ \gamma)}{dt}(t_0) = D(f \circ \alpha)(x) \cdot D(\alpha^{-1} \circ \gamma)(t_0) = D(f \circ \alpha)(x) \cdot v(\alpha).$$

See Figure 41.3. Note that this derivative depends only on f and the velocity
vector v, not on the particular curve $\gamma$.

This formula leads us to define the operator X v as follows: 

If v is a tangent vector to M at p, and if f is a $C^\infty$ real-valued function
defined near p, choose a coordinate patch $\alpha : U \to V$ about p with $\alpha(x) = p$,
and define the derivative of f with respect to v by the equation

$$X_v(f) = D(f \circ \alpha)(x) \cdot v(\alpha).$$

One checks readily that this number is independent of the choice of $\alpha$. One
checks also that $X_{v+w} = X_v + X_w$ and $X_{cv} = cX_v$. Thus the sum of vectors
corresponds to the sum of the corresponding operators, and similarly for a
scalar multiple of a vector.

Note that if $M = \mathbf{R}^k$, then the operator $X_v$ is just the directional
derivative of f with respect to the vector v.
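
The claim that $X_v(f)$ does not depend on the coordinate patch can be checked in a few lines. The sketch below compares two patches $\alpha_0$ and $\alpha_1$ about p; the points $x_0$ and $x_1$ with $\alpha_0(x_0) = p = \alpha_1(x_1)$ are notation introduced here for the check, and $g = \alpha_1^{-1} \circ \alpha_0$ is the transition function, so that $g(x_0) = x_1$.

```latex
\begin{align*}
f \circ \alpha_0
  &= (f \circ \alpha_1) \circ g
     && \text{near } x_0, \\[4pt]
D(f \circ \alpha_0)(x_0) \cdot v(\alpha_0)
  &= D(f \circ \alpha_1)(x_1) \cdot Dg(x_0) \cdot v(\alpha_0)
     && \text{(chain rule)} \\
  &= D(f \circ \alpha_1)(x_1) \cdot v(\alpha_1)
     && \text{(transformation rule } (*)\text{)}.
\end{align*}
```

Both patches therefore give the same number, which is what the definition of $X_v(f)$ requires.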

The operator $X_v$ satisfies the following properties, which are easy to
check:

(1) (Locality). If f and g agree in a neighborhood of p, then $X_v(f) = X_v(g)$.

(2) (Linearity). $X_v(af + bg) = aX_v(f) + bX_v(g)$.

(3) (Product rule). $X_v(f \cdot g) = X_v(f)g(p) + f(p)X_v(g)$.

These properties in fact characterize the operator $X_v$. One has the
following theorem: Let X be an operator that assigns to each $C^\infty$ real-valued
function f defined near p a number denoted X(f), such that X satisfies
conditions (1)-(3). Then there is a unique tangent vector v to M at p such that
$X = X_v$. The proof requires some effort; it is outlined in the exercises.

This theorem suggests an alternative approach to defining tangent vectors. 
One could define a tangent vector to M at p to be simply an operator X
satisfying conditions (1)-(3). The set of these operators is a linear space if we
add operators in the usual way and multiply by scalars in the usual way, and 
thus it can be identified with the tangent space to M at p. 

Many authors prefer to use this definition of tangent vector. It has the 
appeal that it is “intrinsic”; that is, it does not involve coordinate patches 
explicitly. 

Now we define forms on M . 

Definition. An $\ell$-form on M is a function $\omega$ assigning to each $p \in M$
an alternating $\ell$-tensor on the vector space $T_p(M)$. That is,

$$\omega(p) \in \mathcal{A}^\ell(T_p(M))$$

for each $p \in M$.

We require $\omega$ to be of class $C^\infty$ in the following sense: If $\alpha : U \to V$
is a coordinate patch on M about p, with $\alpha(x) = p$, one has the linear
transformation

$$T = \alpha_* : T_x(\mathbf{R}^k) \to T_p(M)$$

and the dual transformation

$$T^* : \mathcal{A}^\ell(T_p(M)) \to \mathcal{A}^\ell(T_x(\mathbf{R}^k)).$$

If $\omega$ is an $\ell$-form on M, then the $\ell$-form $\alpha^*\omega$ is defined as usual by setting

$$(\alpha^*\omega)(x) = T^*(\omega(p)).$$

We say that $\omega$ is of class $C^\infty$ near p if $\alpha^*\omega$ is of class $C^\infty$ near x in the
usual sense. This condition is independent of the choice of coordinate patch.
Thus $\omega$ is of class $C^\infty$ if for every coordinate patch $\alpha$ on M, the form $\alpha^*\omega$
is of class $C^\infty$ in the sense defined earlier.

Henceforth, we assume all our forms are of class $C^\infty$.

Let $\Omega^\ell(M)$ denote the space of $\ell$-forms on M. Note that there are no
elementary forms on M that would enable us to write $\omega$ in canonical form,
as there were in $\mathbf{R}^n$. However, one can write $\alpha^*\omega$ in canonical form as

$$\alpha^*\omega = \sum_{[I]} f_I \, dx_I,$$

where the $dx_I$ are the elementary forms in $\mathbf{R}^k$. We call the functions $f_I$ the
components of $\omega$ with respect to the coordinate patch $\alpha$. They are of course
of class $C^\infty$.

Definition. If $\omega$ is an $\ell$-form on M, we define the differential of $\omega$ as
follows: Given $p \in M$, and given tangent vectors $v_1, \ldots, v_{\ell+1}$ to M at p,
choose a coordinate patch $\alpha : U \to V$ on M about p with $\alpha(x) = p$. Then
define

$$d\omega(p)(v_1, \ldots, v_{\ell+1}) = d(\alpha^*\omega)(x)\bigl((x; v_1(\alpha)), \ldots, (x; v_{\ell+1}(\alpha))\bigr).$$

That is, we define $d\omega$ by choosing a coordinate patch $\alpha$, pulling $\omega$ back
to a form $\alpha^*\omega$ in $\mathbf{R}^k$, pulling $v_1, \ldots, v_{\ell+1}$ back to tangent vectors in $\mathbf{R}^k$,
and then applying the operator d in $\mathbf{R}^k$. One checks that this definition is
independent of the choice of the patch $\alpha$. Then $d\omega$ is of class $C^\infty$.

We can rewrite this equation as follows: Let $a_i = v_i(\alpha)$. The preceding
equation can be written in the form

$$d\omega(p)\bigl(\alpha_*(x; a_1), \ldots, \alpha_*(x; a_{\ell+1})\bigr) = d(\alpha^*\omega)(x)\bigl((x; a_1), \ldots, (x; a_{\ell+1})\bigr).$$

This equation says simply that $\alpha^*(d\omega) = d(\alpha^*\omega)$. Thus one has an alternate
version of the preceding definition:

Definition. If $\omega$ is an $\ell$-form on M, then $d\omega$ is defined to be the unique
$\ell + 1$ form on M such that for every coordinate patch $\alpha$ on M,

$$\alpha^*(d\omega) = d(\alpha^*\omega).$$

Here the “d” on the right side of the equation is the usual differential
operator d in $\mathbf{R}^k$, and the “d” on the left is our new differential operator in M.

Now we define the integral of a k-form over M. We need first to discuss
partitions of unity. Because we assume M is compact, matters are especially 
simple. 

Theorem 41.1. Let M be a compact differentiable manifold. Given
a covering of M by coordinate patches, there exist functions $\phi_i : M \to \mathbf{R}$
of class $C^\infty$, for $i = 1, \ldots, \ell$, such that:

(1) $\phi_i(p) \geq 0$ for each $p \in M$.

(2) For each i, the set Support $\phi_i$ is covered by one of the given
coordinate patches.

(3) $\sum_{i=1}^{\ell} \phi_i(p) = 1$ for each $p \in M$.

Proof. Given $p \in M$, choose a coordinate patch $\alpha : U \to V$ about p.
Let $\alpha(x) = p$; choose a non-negative $C^\infty$ function $f : U \to \mathbf{R}$ whose support
is compact and is contained in U, such that f is positive at the point x.
Define $\psi_p : M \to \mathbf{R}$ by setting

$$\psi_p(y) = \begin{cases} f(\alpha^{-1}(y)) & \text{if } y \in V, \\ 0 & \text{otherwise.} \end{cases}$$

Because $f(\alpha^{-1}(y))$ vanishes outside a compact subset of V, the function $\psi_p$
is of class $C^\infty$ on M.

Now $\psi_p$ is positive on an open set $U_p$ about p. Cover M by finitely many
of the open sets $U_p$, say for $p = p_1, \ldots, p_\ell$. Then set

$$\lambda = \sum_{j=1}^{\ell} \psi_{p_j} \quad \text{and} \quad \phi_i = (1/\lambda)\,\psi_{p_i}. \qquad □$$

Definition. Let M be a compact, oriented differentiable k-manifold.
Let $\omega$ be a k-form on M. If the support of $\omega$ lies in a single coordinate patch
$\alpha : U \to V$ belonging to the orientation of M, define

$$\int_M \omega = \int_{\operatorname{Int} U} \alpha^*\omega.$$

In general, choose $\phi_1, \ldots, \phi_\ell$ as in the preceding theorem and define

$$\int_M \omega = \sum_{i=1}^{\ell} \left[ \int_M \phi_i \omega \right].$$

The usual argument shows this integral is well-defined and linear.


Finally, we have: 

Theorem 41.2 (Stokes’ theorem). Let M be a compact, oriented 
differentiable k-manifold. Let lj be a k - 1 form on M. If dM is non- 
empty, give dM the induced orientation; then 

I du) = I u. 

JM J dM 

If dM is empty, then f M du = 0. 

Proof. The proof given earlier goes through verbatim. Since all the
computations were carried out by working within coordinate patches, no
changes are necessary. The special conventions involved when k = 1 and
$\partial M$ is a 0-manifold are handled exactly as before. □

Not only does Stokes’ theorem generalize to abstract differentiable
manifolds, but the results in Chapter 8 concerning closed forms and exact forms
generalize as well. Given M, one defines the deRham group $H^k(M)$ of M in
dimension k to be the quotient of the space of closed k-forms on M by the
space of exact k-forms. One has various methods for computing the
dimensions of these spaces, including a general Mayer-Vietoris theorem. If M is
written as the union of the two open sets U and V in M, it gives relations
between the deRham groups of M and U and V and $U \cap V$. These topics
are explored in [B-T].
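
In symbols, the definition reads as follows. The example is the standard closed form on the punctured plane that is not exact, with the punctured plane treated as an open subset of $\mathbb{R}^2$. The names $Z^k$ and $B^k$ for the spaces of closed and exact forms are an assumption of this sketch and are not notation used in the text.

```latex
\[
H^k(M) = Z^k(M) / B^k(M), \qquad
Z^k(M) = \{ \omega : d\omega = 0 \}, \quad
B^k(M) = \{ d\eta : \eta \text{ a } (k-1)\text{-form on } M \} \quad (k \ge 1).
\]
% Example: on M = R^2 - 0 the 1-form
\[
\omega = \frac{-y\,dx + x\,dy}{x^2 + y^2}
\]
% is closed, but its integral over the counterclockwise unit circle is 2*pi,
% not 0, so it is not exact and its class in H^1(M) is non-zero.
```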

The vector space $H^k(M)$ is obviously a diffeomorphism invariant of M.
It is an unexpected and striking fact that it is also a topological invariant
of M. This means that if there is a homeomorphism of M with N, then
the vector spaces $H^k(M)$ and $H^k(N)$ are linearly isomorphic. This fact is a
consequence of a celebrated theorem called deRham’s theorem, which states
that the algebra of closed forms on M modulo exact forms is isomorphic to
a certain algebra, defined in algebraic topology for an arbitrary topological
space, called the “cohomology algebra of M with real coefficients.”

Riemannian manifolds 

We have indicated how Stokes’ theorem and the deRham groups generalize 
to abstract differentiable manifolds. Now we consider some of the other topics 
we have treated. Surprisingly, many of these do not generalize as readily. 

Consider for instance the notions of the volume of a manifold M, and of
the integral $\int_M f\,dV$ of a scalar function over M with respect to volume.
These notions do not generalize to abstract differentiable manifolds.

Why should this be so? One way of answering this question is to note that,
according to the discussion in §36, one can define the volume of a compact
oriented k-manifold M in $\mathbb{R}^n$ by the formula

\[
v(M) = \int_M \omega_v,
\]

where $\omega_v$ is a “volume form” for M, that is, $\omega_v$ is a k-form whose value is 1 on
any orthonormal basis for $T_p(M)$ belonging to the natural orientation of this
tangent space. In this case, $T_p(M)$ is a linear subspace of $T_p(\mathbb{R}^n) = p \times \mathbb{R}^n$, so
$T_p(M)$ has a natural inner product derived from the dot product in $\mathbb{R}^n$. This
notion of a volume form cannot be generalized to an arbitrary differentiable
manifold M because we have no inner product on $T_p(M)$ in general, so we do
not know what it means for a set of vectors to be orthonormal.

In order to generalize our definition of volume to a differentiable
manifold M, we need to have an inner product on each tangent space $T_p(M)$:

Definition. Let M be a differentiable k-manifold. A Riemannian
metric on M is an inner product $\langle v, w \rangle$ defined on each tangent space $T_p(M)$;
it is required to be of class $C^\infty$ as a 2-tensor field on M. A Riemannian
manifold consists of a differentiable manifold M along with a Riemannian
metric on M.

(Note that the word “metric” in this context has nothing to do with the 
use of the same word in the phrase “metric space.”) 

Now it is true that for any differentiable manifold M , there exists a Rie- 
mannian metric on M. The proof is not particularly difficult; one uses a 
partition of unity. But the Riemannian metric is certainly not unique. 

Given a Riemannian metric on M, one has a corresponding volume
function $V(v_1, \ldots, v_k)$ defined for k-tuples of vectors of $T_p(M)$. (See the exercises
of §21.) Then one can define the integral of a scalar function just as before:

Definition. Let M be a compact Riemannian manifold of dimension k.
Let $f : M \to \mathbb{R}$ be a continuous function. If the support of f is covered by a
single coordinate patch $\alpha : U \to V$, we define the integral of f over M by
the equation

\[
\int_M f\,dV = \int_{\operatorname{Int} U} (f \circ \alpha)\, V(\alpha_*(x; e_1), \ldots, \alpha_*(x; e_k)).
\]

The integral of f over M is defined in general by using a partition of unity,
just as in §25. The volume of M is defined by the equation

\[
v(M) = \int_M dV.
\]
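
For computation it may help to write the integrand in coordinates. The sketch below assumes the formula $V(v_1, \ldots, v_k) = [\det(\langle v_i, v_j \rangle)]^{1/2}$ for the volume function of an inner product space, which is what the exercises of §21 are taken to provide. The symbol $g_{ij}$ is introduced here only for illustration.

```latex
\[
g_{ij}(x) = \big\langle \alpha_*(x; e_i),\ \alpha_*(x; e_j) \big\rangle,
\qquad
\int_M f \, dV = \int_{\operatorname{Int} U} (f \circ \alpha)\, \sqrt{\det [\, g_{ij} \,]} .
\]
% For M in R^n with the dot product, g_{ij} is the (i,j) entry of
% (D alpha)^T (D alpha), which recovers V(D alpha) = [det((D alpha)^T (D alpha))]^{1/2}.
```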


If M is a compact oriented Riemannian manifold, one can interpret the
integral $\int_M \omega$ of a k-form over M as the integral $\int_M \lambda\,dV$ of a certain scalar
function, just as we did before, where $\lambda(p)$ is the value of $\omega(p)$ on an
orthonormal k-tuple of tangent vectors to M at p that belongs to the natural
orientation of $T_p(M)$ (derived from the orientation of M). If $\lambda(p)$ is
identically 1, then $\omega$ is called the volume form of the Riemannian manifold M,
and is denoted by $\omega_v$. Then

\[
v(M) = \int_M \omega_v.
\]

For a Riemannian manifold M, a host of interesting questions arise.
For instance, one can define what one means by the length of a smooth
parametrized curve $\gamma : [a, b] \to M$; it is just the integral

\[
\int_{t=a}^{t=b} \|\gamma_*(t; e_1)\|.
\]

The integrand is the norm of the velocity vector of the curve $\gamma$, defined of
course by using the inner product on $T_p(M)$. Then one can discuss
“geodesics,” which are “curves of minimal length” joining two points of M. One
goes on to discuss such matters as “curvature.” All this is dealt with in a
subject called Riemannian geometry, which I hope you are tempted to investigate!
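
As an illustration of how the length of a curve depends on the choice of metric, consider the following example, which is not taken from the text. Let M be the open upper half-plane $\{(x, y) : y > 0\}$ with the Riemannian metric $\langle v, w \rangle_{(x,y)} = (v \cdot w)/y^2$, and let $\gamma(t) = (0, t)$ for $a \le t \le b$, where $0 < a < b$.

```latex
% The velocity vector of gamma at time t is (0,1), based at the point (0,t).
\[
\|\gamma_*(t; e_1)\| = \sqrt{\frac{(0,1) \cdot (0,1)}{t^2}} = \frac{1}{t},
\qquad
\operatorname{length}(\gamma) = \int_{t=a}^{t=b} \frac{dt}{t} = \log \frac{b}{a} .
\]
% With the ordinary dot product the same curve has length b - a.
```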

One final comment. As we have indicated, most of what is done in this
book can be generalized, either to abstract differentiable manifolds or to
Riemannian manifolds. One aspect that does not generalize is the interpretation
of Stokes’ theorem in terms of scalar and vector fields given in §38. The reason
is clear. The “translation functions” of §31, which interpret k-forms in $\mathbb{R}^n$ as
scalar fields or vector fields in $\mathbb{R}^n$ for certain values of k, depend crucially
on having forms that are defined in $\mathbb{R}^n$, not on some abstract manifold M.
Furthermore, the operators grad and div apply only to scalar and vector fields
in $\mathbb{R}^n$; and curl applies only in $\mathbb{R}^3$. Even the notion of a “normal vector” to a
manifold M depends on the surrounding space, not just on M.

Said differently, while manifolds and differential forms and Stokes’ theo- 
rem have meaning outside euclidean space, classical vector analysis does not. 

EXERCISES 

1. Show that if $v \in T_p(M)$, then v is the velocity vector of some $C^\infty$ curve $\gamma$
in M passing through p.

2. (a) Let $v \in T_p(M)$. Show that the operator $X_v$ is well-defined.

(b) Verify properties (1)-(3) of the operator $X_v$.

3. If $\omega$ is an $\ell$-form on M, show that $d\omega$ is well-defined (independent of the
choice of the coordinate patch $\alpha$).

4. Verify that the proof of Stokes’ theorem holds for an arbitrary
differentiable manifold.

5. Show that any compact differentiable manifold has a Riemannian metric.

*6. Let M be a differentiable k-manifold; let $p \in M$. Let X be an operator
on $C^\infty$ real-valued functions defined near p, satisfying locality, linearity,
and the product rule. Show there is exactly one tangent vector v to M
at p such that $X = X_v$, as follows:

(a) Let F be a $C^\infty$ function defined on the open cube U in $\mathbb{R}^k$ consisting
of all x with $|x| < \epsilon$. Show there are $C^\infty$ functions $g_1, \ldots, g_k$ defined
on U such that

\[
F(x) - F(0) = \sum_j x_j g_j(x)
\]

for $x \in U$. [Hint: Set

\[
g_j(x) = \int_{t=0}^{t=1} D_j F(x_1, \ldots, x_{j-1}, t x_j, 0, \ldots, 0).
\]

Then $g_j$ is of class $C^\infty$ and

\[
x_j g_j(x) = \int_{t=0}^{t=x_j} D_j F(x_1, \ldots, x_{j-1}, t, 0, \ldots, 0).]
\]

(b) If F and $g_j$ are as in (a), show that

\[
D_j F(0) = g_j(0).
\]

(c) Show that if c is a constant function, then X(c) = 0. [Hint: Show
that $X(1 \cdot 1) = 0$.]

(d) Given X, show there is at most one v such that $X = X_v$. [Hint:
Let $\alpha$ be a coordinate patch about p; let $h = \alpha^{-1}$. If $X = X_v$, show
that the components of $v(\alpha)$ are the numbers $X(h_i)$.]

(e) Given X, show there exists a v such that $X = X_v$. [Hint: Let $\alpha$
be a coordinate patch with $\alpha(0) = p$; let $h = \alpha^{-1}$. Set $v_i = X(h_i)$,
and let v be the tangent vector at p such that $v(\alpha)$ has components
$v_1, \ldots, v_k$. Given f defined near p, set $F = f \circ \alpha$. Then

\[
X_v(f) = \sum_j D_j F(0)\, v_j.
\]

Write $F(x) = \sum_j x_j g_j(x) + F(0)$ for x near 0, as in (a). Then

\[
f = \sum_j h_j \cdot (g_j \circ h) + F(0)
\]

in a neighborhood of p. Calculate X(f) using the three properties
of X.]




Bibliography 


[A] Apostol, T.M., Mathematical Analysis, 2nd edition, Addison-Wesley,
1974.

[A-M-R] Abraham, R., Marsden, J.E., and Ratiu, T., Manifolds, Tensor
Analysis, and Applications, Addison-Wesley, 1983, Springer-Verlag,
1988.

[B] Boothby, W.M., An Introduction to Differentiable Manifolds and
Riemannian Geometry, Academic Press, 1975.

[B-G] Berger, M., and Gostiaux, B., Differential Geometry: Manifolds,
Curves, and Surfaces, Springer-Verlag, 1988.

[B-T] Bott, R., and Tu, L.W., Differential Forms in Algebraic Topology,
Springer-Verlag, 1982.

[D] Devinatz, A., Advanced Calculus, Holt, Rinehart and Winston, 1968.

[F] Fleming, W., Functions of Several Variables, Addison-Wesley, 1965,
Springer-Verlag, 1977.

[Go] Goldberg, R.P., Methods of Real Analysis, Wiley, 1976.

[G-P] Guillemin, V., and Pollack, A., Differential Topology, Prentice-Hall,
1974.

[Gr] Greub, W.H., Multilinear Algebra, 2nd edition, Springer-Verlag, 1978.

[M] Munkres, J.R., Topology, A First Course, Prentice-Hall, 1975.

[N] Northcott, D.G., Multilinear Algebra, Cambridge U. Press, 1984.




[N-S-S] Nickerson, H.K., Spencer, D.C., and Steenrod, N.E., Advanced
Calculus, Van Nostrand, 1959.

[Ro] Royden, H., Real Analysis, 3rd edition, Macmillan, 1988.

[Ru] Rudin, W., Principles of Mathematical Analysis, 3rd edition,
McGraw-Hill, 1976.

[S] Spivak, M., Calculus on Manifolds, Addison-Wesley, 1965.






About the Book:


A substantial course in real analysis is an essential part of the preparation of any
potential mathematician. Analysis on Manifolds is a thorough, class-tested approach
that begins with the derivative and the Riemann integral for functions of several
variables, followed by a treatment of differential forms and a proof of Stokes’
theorem for manifolds in euclidean space.

The book includes careful treatment of both the inverse function theorem and the
change of variables theorem for n-dimensional integrals, as well as a proof of the
Poincaré lemma.

Intended for students at the senior or first-year graduate level, this text includes
more than 120 illustrations and exercises that range from the straightforward
to the challenging. The book evolved from courses on real analysis taught by the
author at the Massachusetts Institute of Technology.


About the Author:

James R. Munkres received his Ph.D. in mathematics in 1956 from the University
of Michigan. Professor of Mathematics at the Massachusetts Institute of Technology
since 1966, he is the author of Elementary Linear Algebra (Addison-Wesley, 1964)
and Elements of Algebraic Topology (Addison-Wesley, 1984).


0-201-51035-9/Hardbound/384 pp./1990



ADDISON-WESLEY PUBLISHING COMPANY

The Advanced Book Program

Redwood City, California • Menlo Park, California
• Reading, Massachusetts • New York • Don Mills,
Ontario • Wokingham, United Kingdom •
Amsterdam • Bonn • Sydney • Singapore • Tokyo
• Madrid • San Juan



